Open In Colab

Problem Statement¶

Business Context¶

Renewable energy sources play an increasingly important role in the global energy mix, as the effort to reduce the environmental impact of energy production increases.

Out of all the renewable energy alternatives, wind energy is one of the most developed technologies worldwide. The U.S Department of Energy has put together a guide to achieving operational efficiency using predictive maintenance practices.

Predictive maintenance uses sensor information and analysis methods to measure and predict degradation and future component capability. The idea behind predictive maintenance is that failure patterns are predictable and if component failure can be predicted accurately and the component is replaced before it fails, the costs of operation and maintenance will be much lower.

The sensors fitted across different machines involved in the process of energy generation collect data related to various environmental factors (temperature, humidity, wind speed, etc.) and additional features related to various parts of the wind turbine (gearbox, tower, blades, break, etc.).

Objective¶

“ReneWind” is a company working on improving the machinery/processes involved in the production of wind energy using machine learning and has collected data of generator failure of wind turbines using sensors. They have shared a ciphered version of the data, as the data collected through sensors is confidential (the type of data collected varies with companies). Data has 40 predictors, 20000 observations in the training set and 5000 in the test set.

The objective is to build various classification models, tune them, and find the best one that will help identify failures so that the generators could be repaired before failing/breaking to reduce the overall maintenance cost. The nature of predictions made by the classification model will translate as follows:

  • True positives (TP) are failures correctly predicted by the model. These will result in repairing costs.
  • False negatives (FN) are real failures where there is no detection by the model. These will result in replacement costs.
  • False positives (FP) are detections where there is no failure. These will result in inspection costs.

It is given that the cost of repairing a generator is much less than the cost of replacing it, and the cost of inspection is less than the cost of repair.

“1” in the target variables should be considered as “failure” and “0” represents “No failure”.

Data Description¶

The data provided is a transformed version of the original data which was collected using sensors.

  • Train.csv - To be used for training and tuning of models.
  • Test.csv - To be used only for testing the performance of the final best model.

Both the datasets consist of 40 predictor variables and 1 target variable.

1 - Installing and Importing the necessary libraries¶

The objective is to build classification neural network model to predict the turbine failures.

Instruction: Restart the runtime after installing libraries to ensure correct package versions and ignore dependency warnings.

In [ ]:
# Installing the libraries with the specified version
!pip install tensorflow==2.18.0 scikit-learn==1.3.2 matplotlib===3.8.3 seaborn==0.13.2 numpy==1.26.4 pandas==2.2.2 -q --user --no-warn-script-location --no-deps
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 61.0/61.0 kB 2.2 MB/s eta 0:00:00
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 615.5/615.5 MB 2.6 MB/s eta 0:00:00
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.8/10.8 MB 42.4 MB/s eta 0:00:00
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.6/11.6 MB 67.2 MB/s eta 0:00:00
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 18.0/18.0 MB 69.2 MB/s eta 0:00:00
In [ ]:
# Libraries for data manipulation, analysis and scientific computing
import pandas as pd
import numpy as np

# Libraries to help with data visualization
import matplotlib.pyplot as plt
import seaborn as sns

# Library for time related functions
import time

# For splitting datasets into training and testing sets.
from sklearn.model_selection import train_test_split
# Tools for data preprocessing including label encoding, one-hot encoding, and standard scaling
from sklearn.preprocessing import LabelEncoder, OneHotEncoder,StandardScaler
# Imports a class for imputing missing values in datasets.
from sklearn.impute import SimpleImputer

# Imports for evaluating the performance of machine learning models
from sklearn import metrics
from sklearn.metrics import (
    confusion_matrix,
    f1_score,
    accuracy_score,
    recall_score,
    precision_score,
    classification_report
)

# Imports the tensorflow, keras and layers.
import tensorflow
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Dense, Input, Dropout, BatchNormalization
from tensorflow.keras import backend

from sklearn.decomposition import PCA

# To suppress unnecessary warnings
import warnings
warnings.filterwarnings("ignore")
In [ ]:
# Removes the limit for the number of displayed columns
pd.set_option("display.max_columns", None)

# Sets the limit for the number of displayed rows
pd.set_option("display.max_rows", 100)

2 - Import Dataset¶

In [1]:
# Mount drive
from google.colab import drive
drive.mount('/content/drive')
Mounted at /content/drive
In [ ]:
# Import train and test set
df = pd.read_csv("/content/drive/MyDrive/Colab Notebooks/Project-4/Train.csv")
df_test = pd.read_csv("/content/drive/MyDrive/Colab Notebooks/Project-4/Test.csv")
In [ ]:
# Set the seed using keras.utils.set_random_seed. This will set:
# 1) `numpy` seed that influences operations that involve randomess like shuffling data, initializing weights or dropout layers
# 2) backend random seed that influences operations within keras itself
# 3) `python` random seed
keras.utils.set_random_seed(812)

# Call function so that TensorFlow attempts to make operations more deterministic
tf.config.experimental.enable_op_determinism()

3 - Data Overview¶

3.1 - Shape of the dataset¶

In [ ]:
# Shape of the train data
df.shape
Out[ ]:
(20000, 41)
In [ ]:
# Shape of the test data
df_test.shape
Out[ ]:
(5000, 41)
In [ ]:
# Make a copy of train data and preserve the original
data = df.copy()
In [ ]:
# Make a copy of test data and preserve the original
data_test = df_test.copy()

Observation

  • The train set has 20000 rows and 41 columns
  • The test set has 5000 rows and 41 columns

3.2 - View sample rows of the dataset¶

In [ ]:
# View the first 5 rows of the data
data.head()
Out[ ]:
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 V17 V18 V19 V20 V21 V22 V23 V24 V25 V26 V27 V28 V29 V30 V31 V32 V33 V34 V35 V36 V37 V38 V39 V40 Target
0 -4.464606 -4.679129 3.101546 0.506130 -0.221083 -2.032511 -2.910870 0.050714 -1.522351 3.761892 -5.714719 0.735893 0.981251 1.417884 -3.375815 -3.047303 0.306194 2.914097 2.269979 4.394876 -2.388299 0.646388 -1.190508 3.132986 0.665277 -2.510846 -0.036744 0.726218 -3.982187 -1.072638 1.667098 3.059700 -1.690440 2.846296 2.235198 6.667486 0.443809 -2.369169 2.950578 -3.480324 0
1 3.365912 3.653381 0.909671 -1.367528 0.332016 2.358938 0.732600 -4.332135 0.565695 -0.101080 1.914465 -0.951458 -1.255259 -2.706522 0.193223 -4.769379 -2.205319 0.907716 0.756894 -5.833678 -3.065122 1.596647 -1.757311 1.766444 -0.267098 3.625036 1.500346 -0.585712 0.783034 -0.201217 0.024883 -1.795474 3.032780 -2.467514 1.894599 -2.297780 -1.731048 5.908837 -0.386345 0.616242 0
2 -3.831843 -5.824444 0.634031 -2.418815 -1.773827 1.016824 -2.098941 -3.173204 -2.081860 5.392621 -0.770673 1.106718 1.144261 0.943301 -3.163804 -4.247825 -4.038909 3.688534 3.311196 1.059002 -2.143026 1.650120 -1.660592 1.679910 -0.450782 -4.550695 3.738779 1.134404 -2.033531 0.840839 -1.600395 -0.257101 0.803550 4.086219 2.292138 5.360850 0.351993 2.940021 3.839160 -4.309402 0
3 1.618098 1.888342 7.046143 -1.147285 0.083080 -1.529780 0.207309 -2.493629 0.344926 2.118578 -3.053023 0.459719 2.704527 -0.636086 -0.453717 -3.174046 -3.404347 -1.281536 1.582104 -1.951778 -3.516555 -1.206011 -5.627854 -1.817653 2.124142 5.294642 4.748137 -2.308536 -3.962977 -6.028730 4.948770 -3.584425 -2.577474 1.363769 0.622714 5.550100 -1.526796 0.138853 3.101430 -1.277378 0
4 -0.111440 3.872488 -3.758361 -2.982897 3.792714 0.544960 0.205433 4.848994 -1.854920 -6.220023 1.998347 4.723757 0.709113 -1.989432 -2.632684 4.184447 2.245356 3.734452 -6.312766 -5.379918 -0.886667 2.061694 9.445586 4.489976 -3.945144 4.582065 -8.780422 -3.382967 5.106507 6.787513 2.044184 8.265896 6.629213 -10.068689 1.222987 -3.229763 1.686909 -2.163896 -3.644622 6.510338 0
In [ ]:
# View random 5 rows of the data
data.sample(5)
Out[ ]:
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 V17 V18 V19 V20 V21 V22 V23 V24 V25 V26 V27 V28 V29 V30 V31 V32 V33 V34 V35 V36 V37 V38 V39 V40 Target
12064 -0.764552 -4.941473 0.222119 -5.095314 -1.262965 -1.112442 -2.694133 0.192237 0.986656 -0.559456 -4.743450 4.087388 5.939446 1.494009 -7.019193 -5.897938 0.388389 7.368904 -3.119402 4.363991 -6.347991 4.909830 4.527317 -2.074595 1.592214 -3.220394 2.033211 -1.738571 -3.812661 -0.262135 -3.018894 -2.489202 -4.217574 -1.504447 -0.134008 6.486465 4.214021 -5.031245 2.967805 0.685425 0
1557 -4.089934 3.152208 3.423398 3.999387 2.250355 -2.849602 -0.595100 1.126280 -3.093891 1.350575 -3.070396 4.199762 -0.267420 -2.377282 -2.559785 -0.771482 0.640583 -0.806499 5.603059 -2.381178 -3.099341 -0.086305 0.882630 8.884040 -3.176450 5.453012 -6.105008 0.034906 -1.283661 1.437989 3.343975 8.946837 3.060782 0.563135 5.420494 0.689776 -2.158107 2.568949 0.539156 -4.008018 0
547 -0.029813 -3.022956 -0.265373 -4.457207 0.300206 -0.144165 -0.065150 2.777640 -2.104705 0.885664 -0.241385 2.069763 1.279603 1.945204 0.378906 3.566069 -1.767384 1.783763 -3.252419 -0.464406 1.130130 0.181083 2.219966 -0.920986 0.457871 -1.776223 2.003802 -1.741961 -0.360826 0.007331 3.710043 1.382491 -0.363915 -1.435063 -0.912192 4.796112 1.195906 -3.659618 0.664196 2.633353 0
13830 2.898566 -1.006482 -2.702558 -4.060640 -1.094510 0.348894 3.174205 4.285828 -1.695746 -2.051190 2.108290 4.125534 -3.177813 0.653460 6.999765 6.382671 3.220894 -0.714219 -2.327440 1.603711 5.611470 -0.624359 2.027053 -4.438849 2.635511 -5.090026 2.779883 0.940352 2.369909 -0.851931 -1.012343 -7.683395 -5.242774 1.483598 -5.274655 0.553090 3.235861 -1.345798 -0.920241 6.664309 1
19580 1.231375 -2.878551 5.776283 -3.307992 -1.662522 -1.616792 -1.254765 -0.436271 0.538919 2.077890 -2.120219 0.285741 5.414061 1.640368 -2.289312 -1.484388 -4.562397 0.093081 -0.771748 1.286346 -3.793965 -0.772187 -3.990718 -5.192889 2.357024 1.240530 5.766191 -3.332751 -2.845104 -3.297941 4.235937 -3.400158 -2.351994 0.767014 1.168552 7.102530 0.006747 -5.306493 2.990179 -0.850073 0
In [ ]:
# View the first 5 rows of the test data
data_test.head()
Out[ ]:
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 V17 V18 V19 V20 V21 V22 V23 V24 V25 V26 V27 V28 V29 V30 V31 V32 V33 V34 V35 V36 V37 V38 V39 V40 Target
0 -0.613489 -3.819640 2.202302 1.300420 -1.184929 -4.495964 -1.835817 4.722989 1.206140 -0.341909 -5.122874 1.017021 4.818549 3.269001 -2.984330 1.387370 2.032002 -0.511587 -1.023069 7.338733 -2.242244 0.155489 2.053786 -2.772273 1.851369 -1.788696 -0.277282 -1.255143 -3.832886 -1.504542 1.586765 2.291204 -5.411388 0.870073 0.574479 4.157191 1.428093 -10.511342 0.454664 -1.448363 0
1 0.389608 -0.512341 0.527053 -2.576776 -1.016766 2.235112 -0.441301 -4.405744 -0.332869 1.966794 1.796544 0.410490 0.638328 -1.389600 -1.883410 -5.017922 -3.827238 2.418060 1.762285 -3.242297 -3.192960 1.857454 -1.707954 0.633444 -0.587898 0.083683 3.013935 -0.182309 0.223917 0.865228 -1.782158 -2.474936 2.493582 0.315165 2.059288 0.683859 -0.485452 5.128350 1.720744 -1.488235 0
2 -0.874861 -0.640632 4.084202 -1.590454 0.525855 -1.957592 -0.695367 1.347309 -1.732348 0.466500 -4.928214 3.565070 -0.449329 -0.656246 -0.166537 -1.630207 2.291865 2.396492 0.601278 1.793534 -2.120238 0.481968 -0.840707 1.790197 1.874395 0.363930 -0.169063 -0.483832 -2.118982 -2.156586 2.907291 -1.318888 -2.997464 0.459664 0.619774 5.631504 1.323512 -1.752154 1.808302 1.675748 0
3 0.238384 1.458607 4.014528 2.534478 1.196987 -3.117330 -0.924035 0.269493 1.322436 0.702345 -5.578345 -0.850662 2.590525 0.767418 -2.390809 -2.341961 0.571875 -0.933751 0.508677 1.210715 -3.259524 0.104587 -0.658875 1.498107 1.100305 4.142988 -0.248446 -1.136516 -5.355810 -4.545931 3.808667 3.517918 -3.074085 -0.284220 0.954576 3.029331 -1.367198 -3.412140 0.906000 -2.450889 0
4 5.828225 2.768260 -1.234530 2.809264 -1.641648 -1.406698 0.568643 0.965043 1.918379 -2.774855 -0.530016 1.374544 -0.650941 -1.679466 -0.379220 -4.443143 3.893857 -0.607640 2.944931 0.367233 -5.789081 4.597528 4.450264 3.224941 0.396701 0.247765 -2.362047 1.079378 -0.473076 2.242810 -3.591421 1.773841 -1.501573 -2.226702 4.776830 -6.559698 -0.805551 -0.276007 -3.858207 -0.537694 0
In [ ]:
# View random 5 rows of the test data
data_test.sample(5)
Out[ ]:
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 V17 V18 V19 V20 V21 V22 V23 V24 V25 V26 V27 V28 V29 V30 V31 V32 V33 V34 V35 V36 V37 V38 V39 V40 Target
4315 2.618494 -3.191676 3.031255 -2.368510 -1.997619 -2.164036 -1.111450 1.190680 1.642055 0.551021 -2.705755 0.285446 5.310006 2.445836 -2.394685 -1.414572 -1.898786 0.392164 -1.680833 3.271001 -3.911604 1.030938 -0.265800 -4.920196 2.666716 -0.754164 4.576995 -2.437166 -3.512911 -2.676940 1.894731 -2.209975 -4.324685 -0.094115 0.615038 4.793602 0.799068 -7.198435 1.321070 -0.378571 0
3559 0.695100 0.168081 1.208282 -6.640823 2.218828 3.085847 0.587179 -2.679194 -1.564612 1.682963 0.508370 -1.099111 -0.858966 0.132783 1.710629 0.427648 -4.251250 3.145762 -5.092608 -5.837187 2.127793 -0.847501 -2.487390 -1.083902 1.038873 1.979315 4.793393 -2.279777 -0.398800 -3.572594 5.485736 -2.713247 1.722762 -2.856871 -3.623083 5.843807 0.084745 2.721640 2.252480 4.447061 0
3464 -0.024189 -3.604224 3.840381 -3.143159 -0.915704 -1.275399 -1.726067 0.780507 -0.436314 1.987590 -1.688781 0.269244 4.547511 1.966785 -2.734846 -0.058644 -4.033663 1.031868 -1.536072 0.913761 -2.922876 -0.326258 -1.600565 -2.739692 1.028948 0.316272 3.389576 -3.042895 -1.845041 -0.945287 4.514637 0.663702 -0.173915 -0.614979 1.913042 6.293275 0.008954 -5.611372 1.991419 -0.608109 0
311 -4.688899 0.286059 2.591574 3.176721 1.182458 -2.073267 -0.708966 -0.780850 -2.334516 2.952707 -5.329062 2.983852 -2.293846 -1.266000 -0.897745 -3.206876 2.664876 1.029436 5.650534 1.638511 -1.052298 0.396357 -1.012577 6.792212 -0.376734 0.347995 -2.421598 2.610988 -3.749525 -2.107858 0.123124 2.609404 -2.112966 4.397915 1.408204 3.269961 -0.330060 3.940974 2.523738 -3.774929 0
3172 -0.328996 0.711194 4.350392 0.367634 0.233979 -0.243263 -1.533171 -3.485223 1.448103 0.724604 -3.733459 -1.711292 0.549223 -1.135692 -2.159418 -5.950062 0.153368 1.796897 0.677420 0.828173 -3.497138 0.127790 -4.355305 0.011858 1.651989 2.378928 1.041839 -0.423864 -2.374576 -3.098095 0.734892 -2.953439 -1.054328 0.207185 0.649782 2.853349 -0.299467 0.829404 2.068049 -0.970966 0

Observation

  • All predictor variables (V1 to V40) are of floating-point data type.

3.3 - Check the data types of the columns¶

In [ ]:
# View the data types of the columns in the train data
data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 20000 entries, 0 to 19999
Data columns (total 41 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   V1      19982 non-null  float64
 1   V2      19982 non-null  float64
 2   V3      20000 non-null  float64
 3   V4      20000 non-null  float64
 4   V5      20000 non-null  float64
 5   V6      20000 non-null  float64
 6   V7      20000 non-null  float64
 7   V8      20000 non-null  float64
 8   V9      20000 non-null  float64
 9   V10     20000 non-null  float64
 10  V11     20000 non-null  float64
 11  V12     20000 non-null  float64
 12  V13     20000 non-null  float64
 13  V14     20000 non-null  float64
 14  V15     20000 non-null  float64
 15  V16     20000 non-null  float64
 16  V17     20000 non-null  float64
 17  V18     20000 non-null  float64
 18  V19     20000 non-null  float64
 19  V20     20000 non-null  float64
 20  V21     20000 non-null  float64
 21  V22     20000 non-null  float64
 22  V23     20000 non-null  float64
 23  V24     20000 non-null  float64
 24  V25     20000 non-null  float64
 25  V26     20000 non-null  float64
 26  V27     20000 non-null  float64
 27  V28     20000 non-null  float64
 28  V29     20000 non-null  float64
 29  V30     20000 non-null  float64
 30  V31     20000 non-null  float64
 31  V32     20000 non-null  float64
 32  V33     20000 non-null  float64
 33  V34     20000 non-null  float64
 34  V35     20000 non-null  float64
 35  V36     20000 non-null  float64
 36  V37     20000 non-null  float64
 37  V38     20000 non-null  float64
 38  V39     20000 non-null  float64
 39  V40     20000 non-null  float64
 40  Target  20000 non-null  int64  
dtypes: float64(40), int64(1)
memory usage: 6.3 MB
In [ ]:
# View the data types of the columns in the test data
data_test.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5000 entries, 0 to 4999
Data columns (total 41 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   V1      4995 non-null   float64
 1   V2      4994 non-null   float64
 2   V3      5000 non-null   float64
 3   V4      5000 non-null   float64
 4   V5      5000 non-null   float64
 5   V6      5000 non-null   float64
 6   V7      5000 non-null   float64
 7   V8      5000 non-null   float64
 8   V9      5000 non-null   float64
 9   V10     5000 non-null   float64
 10  V11     5000 non-null   float64
 11  V12     5000 non-null   float64
 12  V13     5000 non-null   float64
 13  V14     5000 non-null   float64
 14  V15     5000 non-null   float64
 15  V16     5000 non-null   float64
 16  V17     5000 non-null   float64
 17  V18     5000 non-null   float64
 18  V19     5000 non-null   float64
 19  V20     5000 non-null   float64
 20  V21     5000 non-null   float64
 21  V22     5000 non-null   float64
 22  V23     5000 non-null   float64
 23  V24     5000 non-null   float64
 24  V25     5000 non-null   float64
 25  V26     5000 non-null   float64
 26  V27     5000 non-null   float64
 27  V28     5000 non-null   float64
 28  V29     5000 non-null   float64
 29  V30     5000 non-null   float64
 30  V31     5000 non-null   float64
 31  V32     5000 non-null   float64
 32  V33     5000 non-null   float64
 33  V34     5000 non-null   float64
 34  V35     5000 non-null   float64
 35  V36     5000 non-null   float64
 36  V37     5000 non-null   float64
 37  V38     5000 non-null   float64
 38  V39     5000 non-null   float64
 39  V40     5000 non-null   float64
 40  Target  5000 non-null   int64  
dtypes: float64(40), int64(1)
memory usage: 1.6 MB

Convert the 'Target' column to float for compatibility with neural network operations.

In [ ]:
# Convert the 'Target' column to float in train set
data['Target'] = data['Target'].astype(float)
In [ ]:
# Convert the 'Target' column to float in test set
data_test['Target'] = data_test['Target'].astype(float)
In [ ]:
# Check the distinct categories in Target column of train data
print("Train set categories and value counts")
print("Number of unique categories:", data["Target"].nunique(),"\n")
print("Value counts for each category:\n", data["Target"].value_counts(), "\n")
print("Percentage of each category:\n", data["Target"].value_counts()/data["Target"].shape[0], "\n")

print("-" * 50)

# Check the distinct categories in Target column of test data
print("Test set categories and value counts")
print("Number of unique categories:", data_test["Target"].nunique(),"\n")
print("Value counts for each category:\n", data_test["Target"].value_counts(), "\n")
print("Percentage of each category:\n", data_test["Target"].value_counts()/data_test["Target"].shape[0], "\n")
Train set categories and value counts
Number of unique categories: 2 

Value counts for each category:
 Target
0.0    18890
1.0     1110
Name: count, dtype: int64 

Percentage of each category:
 Target
0.0    0.9445
1.0    0.0555
Name: count, dtype: float64 

--------------------------------------------------
Test set categories and value counts
Number of unique categories: 2 

Value counts for each category:
 Target
0.0    4718
1.0     282
Name: count, dtype: int64 

Percentage of each category:
 Target
0.0    0.9436
1.0    0.0564
Name: count, dtype: float64 

Observation

  • As observed earlier, predictor variables (V1 to V40) are of floating-point data type.
  • The Target column was of integer type which was converted to floating type.
  • Missing values are present in columns V1 and V2 in both the train and test datasets.
  • Percentage of each category in 'Target' column shows data imbalance. Approximately 94.45% of the cases represent "No failure" (Target = 0), while about 5.5% represent "Failure" (Target = 1).

3.4 - Check for duplicate values¶

In [ ]:
# Check for duplicates in the train data
data.duplicated().sum()
Out[ ]:
0
In [ ]:
# Check for duplicates in the test data
data_test.duplicated().sum()
Out[ ]:
0

Observation

  • No duplicates in train set
  • No duplicates in test set

3.5 - Check for missing values¶

In [ ]:
# Check for missing values in the train data
data.isnull().sum()
Out[ ]:
0
V1 18
V2 18
V3 0
V4 0
V5 0
V6 0
V7 0
V8 0
V9 0
V10 0
V11 0
V12 0
V13 0
V14 0
V15 0
V16 0
V17 0
V18 0
V19 0
V20 0
V21 0
V22 0
V23 0
V24 0
V25 0
V26 0
V27 0
V28 0
V29 0
V30 0
V31 0
V32 0
V33 0
V34 0
V35 0
V36 0
V37 0
V38 0
V39 0
V40 0
Target 0

In [ ]:
# Check for missing values in the test data
data_test.isnull().sum()
Out[ ]:
0
V1 5
V2 6
V3 0
V4 0
V5 0
V6 0
V7 0
V8 0
V9 0
V10 0
V11 0
V12 0
V13 0
V14 0
V15 0
V16 0
V17 0
V18 0
V19 0
V20 0
V21 0
V22 0
V23 0
V24 0
V25 0
V26 0
V27 0
V28 0
V29 0
V30 0
V31 0
V32 0
V33 0
V34 0
V35 0
V36 0
V37 0
V38 0
V39 0
V40 0
Target 0

Observation

  • As observed earlier, missing values are present in columns V1 and V2 in both the train and test datasets. This needs imputation.

3.6 - Describe dataset¶

In [ ]:
# View the statistical summary of the numerical columns in the train data
data.describe().T
Out[ ]:
count mean std min 25% 50% 75% max
V1 19982.0 -0.271996 3.441625 -11.876451 -2.737146 -0.747917 1.840112 15.493002
V2 19982.0 0.440430 3.150784 -12.319951 -1.640674 0.471536 2.543967 13.089269
V3 20000.0 2.484699 3.388963 -10.708139 0.206860 2.255786 4.566165 17.090919
V4 20000.0 -0.083152 3.431595 -15.082052 -2.347660 -0.135241 2.130615 13.236381
V5 20000.0 -0.053752 2.104801 -8.603361 -1.535607 -0.101952 1.340480 8.133797
V6 20000.0 -0.995443 2.040970 -10.227147 -2.347238 -1.000515 0.380330 6.975847
V7 20000.0 -0.879325 1.761626 -7.949681 -2.030926 -0.917179 0.223695 8.006091
V8 20000.0 -0.548195 3.295756 -15.657561 -2.642665 -0.389085 1.722965 11.679495
V9 20000.0 -0.016808 2.160568 -8.596313 -1.494973 -0.067597 1.409203 8.137580
V10 20000.0 -0.012998 2.193201 -9.853957 -1.411212 0.100973 1.477045 8.108472
V11 20000.0 -1.895393 3.124322 -14.832058 -3.922404 -1.921237 0.118906 11.826433
V12 20000.0 1.604825 2.930454 -12.948007 -0.396514 1.507841 3.571454 15.080698
V13 20000.0 1.580486 2.874658 -13.228247 -0.223545 1.637185 3.459886 15.419616
V14 20000.0 -0.950632 1.789651 -7.738593 -2.170741 -0.957163 0.270677 5.670664
V15 20000.0 -2.414993 3.354974 -16.416606 -4.415322 -2.382617 -0.359052 12.246455
V16 20000.0 -2.925225 4.221717 -20.374158 -5.634240 -2.682705 -0.095046 13.583212
V17 20000.0 -0.134261 3.345462 -14.091184 -2.215611 -0.014580 2.068751 16.756432
V18 20000.0 1.189347 2.592276 -11.643994 -0.403917 0.883398 2.571770 13.179863
V19 20000.0 1.181808 3.396925 -13.491784 -1.050168 1.279061 3.493299 13.237742
V20 20000.0 0.023608 3.669477 -13.922659 -2.432953 0.033415 2.512372 16.052339
V21 20000.0 -3.611252 3.567690 -17.956231 -5.930360 -3.532888 -1.265884 13.840473
V22 20000.0 0.951835 1.651547 -10.122095 -0.118127 0.974687 2.025594 7.409856
V23 20000.0 -0.366116 4.031860 -14.866128 -3.098756 -0.262093 2.451750 14.458734
V24 20000.0 1.134389 3.912069 -16.387147 -1.468062 0.969048 3.545975 17.163291
V25 20000.0 -0.002186 2.016740 -8.228266 -1.365178 0.025050 1.397112 8.223389
V26 20000.0 1.873785 3.435137 -11.834271 -0.337863 1.950531 4.130037 16.836410
V27 20000.0 -0.612413 4.368847 -14.904939 -3.652323 -0.884894 2.189177 17.560404
V28 20000.0 -0.883218 1.917713 -9.269489 -2.171218 -0.891073 0.375884 6.527643
V29 20000.0 -0.985625 2.684365 -12.579469 -2.787443 -1.176181 0.629773 10.722055
V30 20000.0 -0.015534 3.005258 -14.796047 -1.867114 0.184346 2.036229 12.505812
V31 20000.0 0.486842 3.461384 -13.722760 -1.817772 0.490304 2.730688 17.255090
V32 20000.0 0.303799 5.500400 -19.876502 -3.420469 0.052073 3.761722 23.633187
V33 20000.0 0.049825 3.575285 -16.898353 -2.242857 -0.066249 2.255134 16.692486
V34 20000.0 -0.462702 3.183841 -17.985094 -2.136984 -0.255008 1.436935 14.358213
V35 20000.0 2.229620 2.937102 -15.349803 0.336191 2.098633 4.064358 15.291065
V36 20000.0 1.514809 3.800860 -14.833178 -0.943809 1.566526 3.983939 19.329576
V37 20000.0 0.011316 1.788165 -5.478350 -1.255819 -0.128435 1.175533 7.467006
V38 20000.0 -0.344025 3.948147 -17.375002 -2.987638 -0.316849 2.279399 15.289923
V39 20000.0 0.890653 1.753054 -6.438880 -0.272250 0.919261 2.057540 7.759877
V40 20000.0 -0.875630 3.012155 -11.023935 -2.940193 -0.920806 1.119897 10.654265
Target 20000.0 0.055500 0.228959 0.000000 0.000000 0.000000 0.000000 1.000000
In [ ]:
# View the statistical summary of the numerical columns in the test data
data_test.describe().T
Out[ ]:
count mean std min 25% 50% 75% max
V1 4995.0 -0.277622 3.466280 -12.381696 -2.743691 -0.764767 1.831313 13.504352
V2 4994.0 0.397928 3.139562 -10.716179 -1.649211 0.427369 2.444486 14.079073
V3 5000.0 2.551787 3.326607 -9.237940 0.314931 2.260428 4.587000 15.314503
V4 5000.0 -0.048943 3.413937 -14.682446 -2.292694 -0.145753 2.166468 12.140157
V5 5000.0 -0.080120 2.110870 -7.711569 -1.615238 -0.131890 1.341197 7.672835
V6 5000.0 -1.042138 2.005444 -8.924196 -2.368853 -1.048571 0.307555 5.067685
V7 5000.0 -0.907922 1.769017 -8.124230 -2.054259 -0.939695 0.212228 7.616182
V8 5000.0 -0.574592 3.331911 -12.252731 -2.642088 -0.357943 1.712896 10.414722
V9 5000.0 0.030121 2.174139 -6.785495 -1.455712 -0.079891 1.449548 8.850720
V10 5000.0 0.018524 2.145437 -8.170956 -1.353320 0.166292 1.511248 6.598728
V11 5000.0 -2.008615 3.112220 -13.151753 -4.050432 -2.043122 0.044069 9.956400
V12 5000.0 1.576413 2.907401 -8.164048 -0.449674 1.488253 3.562626 12.983644
V13 5000.0 1.622456 2.882892 -11.548209 -0.126012 1.718649 3.464604 12.620041
V14 5000.0 -0.921097 1.803470 -7.813929 -2.110952 -0.896011 0.272324 5.734112
V15 5000.0 -2.452174 3.387041 -15.285768 -4.479072 -2.417131 -0.432943 11.673420
V16 5000.0 -3.018503 4.264407 -20.985779 -5.648343 -2.773763 -0.178105 13.975843
V17 5000.0 -0.103721 3.336513 -13.418281 -2.227683 0.047462 2.111907 19.776592
V18 5000.0 1.195606 2.586403 -12.214016 -0.408850 0.881395 2.604014 13.642235
V19 5000.0 1.210490 3.384662 -14.169635 -1.026394 1.295864 3.526278 12.427997
V20 5000.0 0.138429 3.657171 -13.719620 -2.325454 0.193386 2.539550 13.870565
V21 5000.0 -3.664398 3.577841 -16.340707 -5.944369 -3.662870 -1.329645 11.046925
V22 5000.0 0.961960 1.640414 -6.740239 -0.047728 0.986020 2.029321 7.505291
V23 5000.0 -0.422182 4.056714 -14.422274 -3.162690 -0.279222 2.425911 13.180887
V24 5000.0 1.088841 3.968207 -12.315545 -1.623203 0.912815 3.537195 17.806035
V25 5000.0 0.061235 2.010227 -6.770139 -1.298377 0.076703 1.428491 6.556937
V26 5000.0 1.847261 3.400330 -11.414019 -0.242470 1.917032 4.156106 17.528193
V27 5000.0 -0.552397 4.402947 -13.177038 -3.662591 -0.871982 2.247257 17.290161
V28 5000.0 -0.867678 1.926181 -7.933388 -2.159811 -0.930695 0.420587 7.415659
V29 5000.0 -1.095805 2.655454 -9.987800 -2.861373 -1.340547 0.521843 14.039466
V30 5000.0 -0.118699 3.023292 -12.438434 -1.996743 0.112463 1.946450 10.314976
V31 5000.0 0.468810 3.446324 -11.263271 -1.822421 0.485742 2.779008 12.558928
V32 5000.0 0.232567 5.585628 -17.244168 -3.556267 -0.076694 3.751857 26.539391
V33 5000.0 -0.080115 3.538624 -14.903781 -2.348121 -0.159713 2.099160 13.323517
V34 5000.0 -0.392663 3.166101 -14.699725 -2.009604 -0.171745 1.465402 12.146302
V35 5000.0 2.211205 2.948426 -12.260591 0.321818 2.111750 4.031639 13.489237
V36 5000.0 1.594845 3.774970 -12.735567 -0.866066 1.702964 4.104409 17.116122
V37 5000.0 0.022931 1.785320 -5.079070 -1.240526 -0.110415 1.237522 6.809938
V38 5000.0 -0.405659 3.968936 -15.334533 -2.984480 -0.381162 2.287998 13.064950
V39 5000.0 0.938800 1.716502 -5.451050 -0.208024 0.959152 2.130769 7.182237
V40 5000.0 -0.932406 2.978193 -10.076234 -2.986587 -1.002764 1.079738 8.698460
Target 5000.0 0.056400 0.230716 0.000000 0.000000 0.000000 0.000000 1.000000

Observation

  • The statistical summary shows a wide range of values and varying standard deviations across the predictor variables (V1-V40).
  • Neural networks are sensitive to the scale of input features. Standardizing or normalizing the predictors is necessary to ensure that all features contribute equally during training and to improve model convergence and performance.

4 - Exploratory Data Analysis (EDA)¶

4.1 - Define utility functions for Univariate Analysis¶

Since all the columns are numerical, a utility function is built using histogram and box plots.

In [ ]:
# Function to plot histogram and box plot

def histogram_boxplot(data, feature, figsize=(15, 10), kde=False, bins=None, title=None):
    """
    Boxplot and histogram combined

    data: dataframe
    feature: dataframe column
    figsize: size of figure (default (12,7))
    kde: whether to show density curve (default False)
    bins: number of bins for histogram (default None)
    title: title of plot (default None)
    """
    f2, (ax_box2, ax_hist2) = plt.subplots(
        nrows=2,  # Number of rows of the subplot grid= 2
        sharex=True,  # x-axis will be shared among all subplots
        gridspec_kw={"height_ratios": (0.25, 0.75)},
        figsize=figsize,
    )

    # Add title to the figure if provided
    if title:
        f2.suptitle(title, y=0.9) # Use suptitle for a title over all subplots

    # creating the 2 subplots
    # boxplot will be created and a triangle will indicate the mean value of the column
    sns.boxplot(
        data= data, x= feature, ax= ax_box2, showmeans=True, color="turquoise"
    )

    # For histogram
    sns.histplot(
        data=data, x=feature, kde=kde, ax=ax_hist2, bins=bins, palette="winter"
    ) if bins else sns.histplot(
        data=data, x=feature, kde=kde, ax=ax_hist2
    )

    # Add mean to the histogram
    ax_hist2.axvline(
        data[feature].mean(), color="green", linestyle="--", label = 'Mean'
    )
    # Add median to the histogram
    ax_hist2.axvline(
        data[feature].median(), color="orange", linestyle="-", label = 'Median'
    )
    ax_hist2.legend()

4.2 - Univariate analysis¶

4.2.1 - histogram_boxplot() on predictors V1 to V40¶

In [ ]:
for feature in data.columns.to_list()[:-1]:
    histogram_boxplot(data, feature, kde=True, bins=None, title=f"Distribution of feature {feature}")
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image

Observation

  • The distributions of features V1 to V40 vary, with some exhibiting symmetry and others showing skewness.
  • The box plots reveal the presence of outliers in almost all the features
  • While the direct meaning of these ciphered features is unknown, their varying scales (also seen from describe() function) are important considerations for subsequent data preprocessing steps like normalization/standardization

4.2.2 Check the distrubution of Target variable¶

In [ ]:
# For train data
data["Target"].value_counts(1)
Out[ ]:
proportion
Target
0.0 0.9445
1.0 0.0555

In [ ]:
# display the proportion of the target variable in the test data
data_test["Target"].value_counts(1)
Out[ ]:
proportion
Target
0.0 0.9436
1.0 0.0564

PCA is performed for visualization only

In [ ]:
# Find the index of column V40
data.columns.get_loc('V40')
Out[ ]:
39
In [ ]:
# Visualize data imbalance using PCA
pca = PCA(n_components=2)

# Columns 'V1' and 'V2' have missing values, which will cause an error in fitting.
# For visualization purposes only, we will use column indices from 2 to 39.
# Imputation will be performed after splitting the data.
features_2d = pca.fit_transform(data.iloc[:,2:39])
print("type(features_2d) = ", type(features_2d))

data_2d= pd.DataFrame(features_2d)
data_2d= pd.concat([data_2d, data['Target']], axis=1)
data_2d.columns= ['x', 'y', 'Target']
sns.lmplot(x='x', y='y', data=data_2d, fit_reg=False, hue='Target')
plt.title('PCA of Train Data by Target Class') # Add title here
plt.show() # Ensure the plot is displayed after adding the title
type(features_2d) =  <class 'numpy.ndarray'>
No description has been provided for this image

Observation

  • Class imbalance is already observed in section 3.3
  • Approximately 94.45% of the cases represent "No failure" (Target = 0), while about 5.5% represent "Failure" (Target = 1).
  • PCA gives a better visualization of the imbalance in the train dataset

4.3 - Bivariate Analysis¶

4.3.1 - Correlation Check¶

In [ ]:
# Removing the target column is not necessary to see correlation
# all columns have numerical values
cols_list = data.columns.tolist()

plt.figure(figsize=(20, 20))
sns.heatmap(
    data[cols_list].corr(), annot=True, vmin=-1, vmax=1, fmt=".2f", cmap="Spectral"
)
plt.title('Correlation Heatmap of Features and Target')
plt.show()
No description has been provided for this image

Observation

  • There are some highly positively correlated features with each other (e.g., V15 and V7 have a correlation of 0.87) and some negatively correlated features with each other (e.g., V14 and V2 have a correlation of -0.85).
  • The correlation of any particular feature with the Target variable is relatively low, indicating no strong linear relationships between individual features and the target.

5 - Data Preprocessing¶

5.1 - Data Preparation for Modeling¶

In [ ]:
# Divide train data into X and y

# Drop the 'Target' column
X = data.drop(columns = ["Target"] , axis=1)

# Column named 'Target' becomes y
y = data["Target"]
In [ ]:
# Split data into training and validation set with 20% of the data allocated to validation set
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=1, stratify=y)
In [ ]:
# Check the shape of X_train and y_train
print(X_train.shape)
print(y_train.shape)
(16000, 40)
(16000,)
In [ ]:
# Check the shape of X_val and y_val
print(X_val.shape)
print(y_val.shape)
(4000, 40)
(4000,)
In [ ]:
# Divide test data into X_test and y_test

# Drop the 'Target' column for X
X_test = data_test.drop(columns = ['Target'] , axis= 1)

# Retain only 'Target' column for y
y_test = data_test["Target"]
In [ ]:
# Check the shape of X_test and y_test
print(X_test.shape)
print(y_test.shape)
(5000, 40)
(5000,)
In [ ]:
# Check the percentage of classes in target column in training, validation and test set
print("Percentage of classes in Target column in train set : ", y_train.value_counts(1))
print("Percentage of classes in Target column in validation set : ", y_val.value_counts(1))
print("Percentage of classes in Target column in test set : ", y_test.value_counts(1))
Percentage of classes in Target column in train set :  Target
0.0    0.9445
1.0    0.0555
Name: proportion, dtype: float64
Percentage of classes in Target column in validation set :  Target
0.0    0.9445
1.0    0.0555
Name: proportion, dtype: float64
Percentage of classes in Target column in test set :  Target
0.0    0.9436
1.0    0.0564
Name: proportion, dtype: float64

5.2 - Missing Value Imputation and Scaling¶

In [ ]:
imputer = SimpleImputer(strategy="median")
In [ ]:
# Fit and transform the train data - calculates the median from the training data and imputes missing values
X_train = pd.DataFrame(imputer.fit_transform(X_train), columns=X_train.columns)

# Transform the validation data using the median calculated from the training data
X_val = pd.DataFrame(imputer.transform(X_val), columns=X_train.columns)

# Transform the test data using the median calculated from the training data
X_test = pd.DataFrame(imputer.transform(X_test), columns=X_train.columns)
In [ ]:
# Verify that no data set has any values missing
print("Missing values in X_train : ", X_train.isnull().sum())
print("-" * 30)
print("Missing values in X_val : ", X_val.isnull().sum())
print("-" * 30)
print("Missing values in X_test : ", X_test.isnull().sum())
Missing values in X_train :  V1     0
V2     0
V3     0
V4     0
V5     0
V6     0
V7     0
V8     0
V9     0
V10    0
V11    0
V12    0
V13    0
V14    0
V15    0
V16    0
V17    0
V18    0
V19    0
V20    0
V21    0
V22    0
V23    0
V24    0
V25    0
V26    0
V27    0
V28    0
V29    0
V30    0
V31    0
V32    0
V33    0
V34    0
V35    0
V36    0
V37    0
V38    0
V39    0
V40    0
dtype: int64
------------------------------
Missing values in X_val :  V1     0
V2     0
V3     0
V4     0
V5     0
V6     0
V7     0
V8     0
V9     0
V10    0
V11    0
V12    0
V13    0
V14    0
V15    0
V16    0
V17    0
V18    0
V19    0
V20    0
V21    0
V22    0
V23    0
V24    0
V25    0
V26    0
V27    0
V28    0
V29    0
V30    0
V31    0
V32    0
V33    0
V34    0
V35    0
V36    0
V37    0
V38    0
V39    0
V40    0
dtype: int64
------------------------------
Missing values in X_test :  V1     0
V2     0
V3     0
V4     0
V5     0
V6     0
V7     0
V8     0
V9     0
V10    0
V11    0
V12    0
V13    0
V14    0
V15    0
V16    0
V17    0
V18    0
V19    0
V20    0
V21    0
V22    0
V23    0
V24    0
V25    0
V26    0
V27    0
V28    0
V29    0
V30    0
V31    0
V32    0
V33    0
V34    0
V35    0
V36    0
V37    0
V38    0
V39    0
V40    0
dtype: int64
In [ ]:
# Backup the column names
column_names = X_train.columns.to_list()
In [ ]:
# Standardizing all the columns as they have different ranges.
# This is to ensure that all features contribute equally, training becomes more stable, and the network can learn faster

# Initialize StandardScaler
scaler = StandardScaler()
In [ ]:
# Fit the scaler on the training data and transform it
# This calculates the mean and standard deviation from the training data and applies the transformation
X_train = scaler.fit_transform(X_train)

# Transform the validation data using the scaler fitted on the training data
# This ensures that the validation data is scaled using the same parameters as the training data, preventing data leakage
X_val = scaler.transform(X_val)

# Transform the test data using the scaler fitted on the training data
# This ensures consistency in scaling across all datasets
X_test = scaler.transform(X_test)
In [ ]:
print("type(X_train) : ", type(X_train))
print("type(X_val) : ", type(X_val))
print("type(X_test) : ", type(X_test))
type(X_train) :  <class 'numpy.ndarray'>
type(X_val) :  <class 'numpy.ndarray'>
type(X_test) :  <class 'numpy.ndarray'>
In [ ]:
print("X_train shape : ", X_train.shape)
print("X_val shape : ", X_val.shape)
print("X_test shape : ", X_test.shape)
X_train shape :  (16000, 40)
X_val shape :  (4000, 40)
X_test shape :  (5000, 40)
In [ ]:
# Convert target variables to NumPy arrays for compatibility with neural network libraries
y_train = y_train.to_numpy()
y_val = y_val.to_numpy()
y_test = y_test.to_numpy()
In [ ]:
print("y_train shape : ", y_train.shape)
print("y_val shape : ", y_val.shape)
print("y_test shape : ", y_test.shape)
y_train shape :  (16000,)
y_val shape :  (4000,)
y_test shape :  (5000,)
In [ ]:
print("type(X_train) : ", type(X_train), "type(y_train) = ", type(y_train))
print("type(X_val) : ", type(X_val), "type(y_val) = ", type(y_val))
print("type(X_test) : ", type(X_test), "type(y_test) = ", type(y_test))
type(X_train) :  <class 'numpy.ndarray'> type(y_train) =  <class 'numpy.ndarray'>
type(X_val) :  <class 'numpy.ndarray'> type(y_val) =  <class 'numpy.ndarray'>
type(X_test) :  <class 'numpy.ndarray'> type(y_test) =  <class 'numpy.ndarray'>

Observation

  • The dataset contains only numerical features (V1-V40) and a binary target variable, so categorical encoding is not required.
  • Missing values in features V1 and V2 in train, validation and test sets are imputed.
  • The predictor variables have a wide range of values, necessitating normalization or standardization for optimal neural network performance.
  • A separate test set is provided for final model evaluation.
  • Imputation and scaling is performed after splitting the data to prevent data leakage.
  • The target variable exhibits class imbalance, which should be addressed during model training (e.g., using class weights or other techniques).

6 - Model Building and Performance Improvement¶

6.1 - Model Evaluation Criterion¶

In this problem, we aim to predict generator failures, where:

  • Class 1 represents "Failure"
  • Class 0 represents "No failure"

When evaluating a classification model, we consider the following outcomes:

  • True Positives (TP): The model correctly predicts a "Failure" (Class 1) when the actual class is "Failure" (Class 1). These result in repair costs.
  • True Negatives (TN): The model correctly predicts "No failure" (Class 0) when the actual class is "No failure" (Class 0).
  • False Negatives (FN): The model incorrectly predicts "No failure" (Class 0) when the actual class is "Failure" (Class 1). These are costly as they lead to generator replacement.
  • False Positives (FP): The model incorrectly predicts "Failure" (Class 1) when the actual class is "No failure" (Class 0). These result in inspection costs.

Based on the problem description, the cost of a False Negative (replacement cost) is significantly higher than the cost of a False Positive (inspection cost), and the cost of a True Positive (repair cost) is less than the replacement cost.

Therefore, minimizing False Negatives is crucial. This translates to prioritizing a model that has a high Recall for the positive class (Failure). Recall measures the model's ability to identify all actual positive instances. A higher recall means the model is better at detecting failures, thus reducing the number of missed failures (False Negatives) and the associated high replacement costs.

6.2 - Define utility Functions¶

  • A function required to plot loss and recall
  • A function to derive metrics
  • A function to visualize confusion matrix
In [ ]:
# Plots loss and a specified metric against epochs
def plot(history, name, model_config):
    """
    Function to plot loss/accuracy

    history: an object which stores the metrics and losses.
    name: can be one of Loss or Accuracy
    model_config: name of the model
    """
    #Creating a subplot with figure and axes.
    fig, ax = plt.subplots()
    #Plotting the train accuracy or train loss
    plt.plot(history.history[name])
    #Plotting the validation accuracy or validation loss
    plt.plot(history.history['val_'+name])

    #Defining the title of the plot
    ax.set_title(f"Model - {name.capitalize()}\n{model_config}", fontsize=10)
    #Capitalizing the first letter
    plt.ylabel(name.capitalize())
    #Defining the label for the x-axis
    plt.xlabel('Epoch')
    #Defining the legend, loc controls the position of the legend.
    fig.legend(['Train', 'Validation'], loc="outside lower right")
In [ ]:
# Define a function to compute different metrics to check performance of a classification model built using statsmodels
def model_performance_classification(
    model, predictors, target, threshold=0.5
):
    """
    Function to compute different metrics to check classification model performance

    model: classifier
    predictors: independent variables
    target: dependent variable
    threshold: threshold for classifying the observation as class 1
    """

    # checking which probabilities are greater than threshold
    pred = model.predict(predictors) > threshold
    #print("type(pred) : ", type(pred), "Sample values : ", pred[0:5, :])
    #print("type(target) : ", type(target), "Sample values : ", target[0:5])

    acc = accuracy_score(target, pred)  # Compute Accuracy
    recall = recall_score(target, pred, average='macro')  # Compute Recall
    precision = precision_score(target, pred, average='macro')  # Compute Precision
    f1 = f1_score(target, pred, average='macro')  # Compute F1-score

    # Creating a dataframe of metrics
    df_perf = pd.DataFrame(
        {"Accuracy": acc, "Recall": recall, "Precision": precision, "F1 Score": f1,}, index = [0]
    )

    return df_perf
In [ ]:
# Define function to plot sklearn confusion matrix using seaborn heatmap visalisation
def make_confusion_matrix(cf,
                          group_names=['True Negative','False Positive','False Negative','True Positive'],
                          categories='auto',
                          count=True,
                          percent=True,
                          cbar=True,
                          xyticks=True,
                          xyplotlabels=True,
                          sum_stats=True,
                          figsize=None,
                          cmap='Blues',
                          title=None):
    '''
    This function will make a pretty plot of an sklearn Confusion Matrix cm using a Seaborn heatmap visualization.
    Arguments
    ---------
    cf:            confusion matrix to be passed in
    group_names:   List of strings that represent the labels row by row to be shown in each square.
    categories:    List of strings containing the categories to be displayed on the x,y axis.
    count:         If True, show the raw number in the confusion matrix.
    normalize:     If True, show the proportions for each category.
    cbar:          If True, show the color bar.
    xyticks:       If True, show x and y ticks.
    xyplotlabels:  If True, show 'True Label' and 'Predicted Label' on the figure.
    sum_stats:     If True, display summary statistics below the figure.
    figsize:       Tuple representing the figure size.
    cmap:          Colormap of the values displayed from matplotlib.
    title:         Title for the heatmap.
    '''

    print(cf)

    # Code to generate text inside each sqaure
    blanks = ['' for i in range(cf.size)]

    if group_names and len(group_names)==cf.size:
        group_labels = ["{}\n".format(value) for value in group_names]
    else:
        group_labels = blanks

    if count:
        group_counts = ["{0:0.0f}\n".format(value) for value in cf.flatten()]
    else:
        group_counts = blanks

    if percent:
        group_percentages = ["{0:.2%}".format(value) for value in cf.flatten()/np.sum(cf)]
    else:
        group_percentages = blanks

    box_labels = [f"{v1}{v2}{v3}".strip() for v1, v2, v3 in zip(group_labels,group_counts,group_percentages)]
    box_labels = np.asarray(box_labels).reshape(cf.shape[0],cf.shape[1])


    # Code to generate summary statistics and text for summary stats
    if sum_stats:
        # Accuracy is sum of diagonal divided by total observations
        accuracy  = np.trace(cf) / float(np.sum(cf))

        # Show some more stats for binary confusion matrix
        if len(cf)==2:
            # Metrics for Binary Confusion Matrices
            precision = cf[1,1] / sum(cf[:,1])
            recall    = cf[1,1] / sum(cf[1,:])
            f1_score  = 2*precision*recall / (precision + recall)
            stats_text = "\n\nAccuracy={:0.3f}\nPrecision={:0.3f}\nRecall={:0.3f}\nF1 Score={:0.3f}".format(
                accuracy,precision,recall,f1_score)
        else:
            stats_text = "\n\nAccuracy={:0.3f}".format(accuracy)
    else:
        stats_text = ""

    # Set figure parameters according to other arguments
    if figsize==None:
        # Get default figure size if not set
        figsize = plt.rcParams.get('figure.figsize')

    if xyticks==False:
        # Do not show categories if xyticks is False
        categories=False

    # Plot heatmap visualisation
    plt.figure(figsize=figsize)
    sns.heatmap(cf,annot=box_labels,fmt="",cmap=cmap,cbar=cbar,xticklabels=categories,yticklabels=categories)

    if xyplotlabels:
        plt.ylabel('True label')
        plt.xlabel('Predicted label' + stats_text)
    else:
        plt.xlabel(stats_text)

    if title:
        plt.title(title, fontsize=10)
In [ ]:
# Since the class is imabalnced in the data, we'll use class weights in all the models
# Calculate class weights for imbalanced dataset. This gives higher weight to the minority class.
cw = (y_train.shape[0]) / np.bincount(y_train.astype(int))

# Create a dictionary mapping class indices to their respective class weights
cw_dict = {}
for i in range(cw.shape[0]):
    cw_dict[i] = cw[i]

cw_dict
Out[ ]:
{0: 1.0587612493382743, 1: 18.01801801801802}

Since the class is heavily imabalnced in the data, we'll use class weights in all the models

6.3 - Model 0 - Baseline Model¶

  • Model Configuration
    • Model Architecture: Input Layer (40 neurons, implicitly defined by input data), 1 Hidden Layer (7 neurons), Output Layer (1 neuron).
    • Activation Functions: ReLU for the hidden layer, Sigmoid for the output layer (for binary classification probability).
    • Optimizer: Stochastic Gradient Descent (SGD).
    • Loss Function: Binary Crossentropy (standard for binary classification).
    • Class Imbalance Handling: Utilizing class weights to address the skewed distribution of the target variable.
    • Training Parameters:
      • Epochs: 25
      • Batch Size: 32
In [ ]:
# Define model configuration string to use in the title of graphs
model_0_config = "(1HL,7N,relu \n sgd \n bs32, 25E) "
In [ ]:
# Clear the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
In [ ]:
# Initializing the sequential model
model_0 = Sequential()

# Add the first hidden layer with 7 neurons, ReLU activation, and input dimension matching the number of features in X_train
model_0.add(Dense(7, activation="relu",input_dim=X_train.shape[1]))

# Add the output layer with 1 neuron and sigmoid activation for binary classification probability output
model_0.add(Dense(1, activation="sigmoid"))
In [ ]:
model_0.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense (Dense)                   │ (None, 7)              │           287 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 1)              │             8 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 295 (1.15 KB)
 Trainable params: 295 (1.15 KB)
 Non-trainable params: 0 (0.00 B)
In [ ]:
# Define SGD as the optimizer to be used
optimizer = tf.keras.optimizers.SGD()

# Recall is the chosen metric to measure
model_0.compile(loss='binary_crossentropy', optimizer=optimizer, metrics = ['Recall'])
In [ ]:
start = time.time()
history = model_0.fit(X_train, y_train, validation_data=(X_val,y_val), batch_size=32, epochs=25, class_weight=cw_dict)
end=time.time()
Epoch 1/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - Recall: 0.7680 - loss: 0.9952 - val_Recall: 0.8739 - val_loss: 0.3598
Epoch 2/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8993 - loss: 0.5882 - val_Recall: 0.8784 - val_loss: 0.3195
Epoch 3/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9010 - loss: 0.5407 - val_Recall: 0.8784 - val_loss: 0.2924
Epoch 4/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9082 - loss: 0.5091 - val_Recall: 0.8784 - val_loss: 0.2752
Epoch 5/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.9045 - loss: 0.4882 - val_Recall: 0.8784 - val_loss: 0.2628
Epoch 6/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.8960 - loss: 0.4735 - val_Recall: 0.8784 - val_loss: 0.2507
Epoch 7/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8995 - loss: 0.4573 - val_Recall: 0.8784 - val_loss: 0.2418
Epoch 8/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8947 - loss: 0.4461 - val_Recall: 0.8784 - val_loss: 0.2335
Epoch 9/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8970 - loss: 0.4395 - val_Recall: 0.8829 - val_loss: 0.2271
Epoch 10/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8967 - loss: 0.4342 - val_Recall: 0.8784 - val_loss: 0.2245
Epoch 11/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8963 - loss: 0.4304 - val_Recall: 0.8829 - val_loss: 0.2228
Epoch 12/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8983 - loss: 0.4254 - val_Recall: 0.8874 - val_loss: 0.2209
Epoch 13/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8995 - loss: 0.4216 - val_Recall: 0.8874 - val_loss: 0.2143
Epoch 14/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8987 - loss: 0.4174 - val_Recall: 0.8874 - val_loss: 0.2112
Epoch 15/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8994 - loss: 0.4132 - val_Recall: 0.8874 - val_loss: 0.2093
Epoch 16/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9013 - loss: 0.4104 - val_Recall: 0.8874 - val_loss: 0.2079
Epoch 17/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.9007 - loss: 0.4079 - val_Recall: 0.8919 - val_loss: 0.2063
Epoch 18/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9014 - loss: 0.4057 - val_Recall: 0.8919 - val_loss: 0.2058
Epoch 19/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.9014 - loss: 0.4046 - val_Recall: 0.8919 - val_loss: 0.2028
Epoch 20/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9014 - loss: 0.4019 - val_Recall: 0.8919 - val_loss: 0.2030
Epoch 21/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8998 - loss: 0.4013 - val_Recall: 0.8919 - val_loss: 0.2017
Epoch 22/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9010 - loss: 0.3999 - val_Recall: 0.8874 - val_loss: 0.2005
Epoch 23/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9010 - loss: 0.3986 - val_Recall: 0.8829 - val_loss: 0.1985
Epoch 24/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9022 - loss: 0.3960 - val_Recall: 0.8829 - val_loss: 0.1980
Epoch 25/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9045 - loss: 0.3950 - val_Recall: 0.8829 - val_loss: 0.1940
In [ ]:
print("Time taken in seconds ",end - start)
Time taken in seconds  28.988130807876587
In [ ]:
plot(history,'loss', model_0_config)
No description has been provided for this image
In [ ]:
plot(history,'Recall', model_0_config)
No description has been provided for this image
In [ ]:
model_0_train_perf = model_performance_classification(model_0, X_train, y_train)
model_0_train_perf
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 996us/step
Out[ ]:
Accuracy Recall Precision F1 Score
0 0.945438 0.925008 0.749345 0.808852
In [ ]:
model_0_val_perf = model_performance_classification(model_0, X_val, y_val)
model_0_val_perf
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step  
Out[ ]:
Accuracy Recall Precision F1 Score
0 0.9505 0.918678 0.762725 0.818843
In [ ]:
# Evaluate model performance on the test set and visualize the confusion matrix.
y_test_pred = model_0.predict(X_test)
# y_test_pred[0:10, :]
for i in range(len(y_test)):
    if y_test_pred[i]>0.5:
        y_test_pred[i]=1
    else:
        y_test_pred[i]=0

cm2=confusion_matrix(y_test, y_test_pred)
make_confusion_matrix(
    cm2,
    cmap='plasma',
    title="Confusion Matrix for Model 0 \n" + model_0_config
    )
157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step  
[[4512  206]
 [  41  241]]
No description has been provided for this image

Observation

  • Baseline model
    • The loss function shows a steep decrease in the initial epochs and then stabilizes, indicating convergence.
    • Recall on the training set is approximately 92.50%, while on the validation set it's around 91.87%. The test set recall is 85.5%.
    • The number of False Negatives on the test set is 41.
    • The drop in recall from training/validation to the test set suggests a potential generalization gap.

6.4 - Model 1 - Increasing Epochs¶

  • Model Configuration
    • Update epochs to 50 on Model 0 configuration
In [ ]:
# Define model configuration string to use in the title of graphs
model_1_config = "(1HL,7N,relu \n sgd \n bs32, 50E) "
In [ ]:
# Clear the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
In [ ]:
# Initializing the sequential model
model_1 = Sequential()

# Add the first hidden layer with 7 neurons, ReLU activation, and input dimension matching the number of features in X_train
model_1.add(Dense(7, activation="relu",input_dim=X_train.shape[1]))

# Add the output layer with 1 neuron and sigmoid activation for binary classification probability output
model_1.add(Dense(1, activation="sigmoid"))
In [ ]:
model_1.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense (Dense)                   │ (None, 7)              │           287 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 1)              │             8 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 295 (1.15 KB)
 Trainable params: 295 (1.15 KB)
 Non-trainable params: 0 (0.00 B)
In [ ]:
# Define SGD as the optimizer to be used
optimizer = tf.keras.optimizers.SGD()

# Recall is the chosen metric to measure
model_1.compile(loss='binary_crossentropy', optimizer=optimizer, metrics = ['Recall'])
In [ ]:
start = time.time()
history = model_1.fit(X_train, y_train, validation_data=(X_val,y_val), batch_size=32, epochs=50, class_weight=cw_dict)
end=time.time()
Epoch 1/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - Recall: 0.8738 - loss: 0.9185 - val_Recall: 0.8739 - val_loss: 0.4082
Epoch 2/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.8740 - loss: 0.6401 - val_Recall: 0.8829 - val_loss: 0.3537
Epoch 3/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - Recall: 0.8736 - loss: 0.5927 - val_Recall: 0.8784 - val_loss: 0.3259
Epoch 4/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8747 - loss: 0.5649 - val_Recall: 0.8784 - val_loss: 0.3102
Epoch 5/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8769 - loss: 0.5461 - val_Recall: 0.8829 - val_loss: 0.2934
Epoch 6/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8804 - loss: 0.5313 - val_Recall: 0.8829 - val_loss: 0.2796
Epoch 7/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8783 - loss: 0.5157 - val_Recall: 0.8784 - val_loss: 0.2658
Epoch 8/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8872 - loss: 0.4947 - val_Recall: 0.8784 - val_loss: 0.2518
Epoch 9/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8919 - loss: 0.4708 - val_Recall: 0.8739 - val_loss: 0.2399
Epoch 10/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8944 - loss: 0.4526 - val_Recall: 0.8829 - val_loss: 0.2245
Epoch 11/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8967 - loss: 0.4364 - val_Recall: 0.8919 - val_loss: 0.2161
Epoch 12/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8975 - loss: 0.4279 - val_Recall: 0.8919 - val_loss: 0.2099
Epoch 13/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9020 - loss: 0.4236 - val_Recall: 0.8919 - val_loss: 0.2058
Epoch 14/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9020 - loss: 0.4207 - val_Recall: 0.8919 - val_loss: 0.2037
Epoch 15/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9017 - loss: 0.4185 - val_Recall: 0.8919 - val_loss: 0.2059
Epoch 16/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9010 - loss: 0.4171 - val_Recall: 0.8919 - val_loss: 0.2052
Epoch 17/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9010 - loss: 0.4155 - val_Recall: 0.8919 - val_loss: 0.2038
Epoch 18/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9002 - loss: 0.4128 - val_Recall: 0.8919 - val_loss: 0.2023
Epoch 19/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8972 - loss: 0.4115 - val_Recall: 0.8919 - val_loss: 0.2024
Epoch 20/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8964 - loss: 0.4103 - val_Recall: 0.8874 - val_loss: 0.2010
Epoch 21/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8964 - loss: 0.4091 - val_Recall: 0.8874 - val_loss: 0.2004
Epoch 22/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8964 - loss: 0.4083 - val_Recall: 0.8874 - val_loss: 0.1997
Epoch 23/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8964 - loss: 0.4077 - val_Recall: 0.8874 - val_loss: 0.1992
Epoch 24/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8972 - loss: 0.4069 - val_Recall: 0.8829 - val_loss: 0.1992
Epoch 25/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8972 - loss: 0.4056 - val_Recall: 0.8874 - val_loss: 0.1979
Epoch 26/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 3s 3ms/step - Recall: 0.8980 - loss: 0.4040 - val_Recall: 0.8874 - val_loss: 0.1974
Epoch 27/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8980 - loss: 0.4029 - val_Recall: 0.8874 - val_loss: 0.1977
Epoch 28/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8965 - loss: 0.4019 - val_Recall: 0.8874 - val_loss: 0.1988
Epoch 29/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8981 - loss: 0.4009 - val_Recall: 0.8874 - val_loss: 0.1945
Epoch 30/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8981 - loss: 0.3980 - val_Recall: 0.8874 - val_loss: 0.1933
Epoch 31/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8972 - loss: 0.3966 - val_Recall: 0.8874 - val_loss: 0.1928
Epoch 32/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8972 - loss: 0.3954 - val_Recall: 0.8874 - val_loss: 0.1912
Epoch 33/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9004 - loss: 0.3934 - val_Recall: 0.8874 - val_loss: 0.1904
Epoch 34/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8998 - loss: 0.3914 - val_Recall: 0.8874 - val_loss: 0.1900
Epoch 35/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9010 - loss: 0.3900 - val_Recall: 0.8874 - val_loss: 0.1900
Epoch 36/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9039 - loss: 0.3885 - val_Recall: 0.8874 - val_loss: 0.1904
Epoch 37/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.9039 - loss: 0.3872 - val_Recall: 0.8874 - val_loss: 0.1887
Epoch 38/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9059 - loss: 0.3863 - val_Recall: 0.8874 - val_loss: 0.1861
Epoch 39/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9090 - loss: 0.3845 - val_Recall: 0.8874 - val_loss: 0.1846
Epoch 40/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9090 - loss: 0.3834 - val_Recall: 0.8874 - val_loss: 0.1844
Epoch 41/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9065 - loss: 0.3827 - val_Recall: 0.8874 - val_loss: 0.1836
Epoch 42/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9069 - loss: 0.3811 - val_Recall: 0.8874 - val_loss: 0.1834
Epoch 43/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9068 - loss: 0.3805 - val_Recall: 0.8874 - val_loss: 0.1830
Epoch 44/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9089 - loss: 0.3796 - val_Recall: 0.8874 - val_loss: 0.1828
Epoch 45/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9089 - loss: 0.3785 - val_Recall: 0.8874 - val_loss: 0.1829
Epoch 46/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9089 - loss: 0.3776 - val_Recall: 0.8874 - val_loss: 0.1823
Epoch 47/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9098 - loss: 0.3766 - val_Recall: 0.8874 - val_loss: 0.1793
Epoch 48/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9098 - loss: 0.3747 - val_Recall: 0.8874 - val_loss: 0.1778
Epoch 49/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.9077 - loss: 0.3731 - val_Recall: 0.8874 - val_loss: 0.1769
Epoch 50/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 3s 3ms/step - Recall: 0.9074 - loss: 0.3719 - val_Recall: 0.8874 - val_loss: 0.1751
In [ ]:
print("Time taken in seconds ",end - start)
Time taken in seconds  59.48925471305847
In [ ]:
plot(history,'loss', model_1_config)
No description has been provided for this image
In [ ]:
plot(history,'Recall', model_1_config)
No description has been provided for this image
In [ ]:
model_1_train_perf = model_performance_classification(model_1, X_train, y_train)
model_1_train_perf
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 999us/step
Out[ ]:
Accuracy Recall Precision F1 Score
0 0.955562 0.933018 0.778021 0.834986
In [ ]:
model_1_val_perf = model_performance_classification(model_1, X_val, y_val)
model_1_val_perf
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step  
Out[ ]:
Accuracy Recall Precision F1 Score
0 0.95825 0.924901 0.787146 0.839934
In [ ]:
# Evaluate Model 0 performance on the test set and visualize the confusion matrix.
y_test_pred = model_1.predict(X_test)
#print(y_test_pred[0:10, :])
for i in range(len(y_test)):
    if y_test_pred[i]>0.5:
        y_test_pred[i]=1
    else:
        y_test_pred[i]=0

cm2=confusion_matrix(y_test, y_test_pred)
make_confusion_matrix(
    cm2,
    cmap='plasma',
    title="Confusion Matrix for Model 1 \n" + model_1_config
    )
157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step
[[4528  190]
 [  40  242]]
No description has been provided for this image

Observation

  • Model 1 (Increased Epochs):
    • The loss function and recall plots show minimal improvement beyond 25 epochs, suggesting convergence was largely achieved earlier.
    • Doubling the training epochs resulted in a proportional increase in training time with only marginal improvements in performance metrics across all datasets.
    • Recall on the test set improved slightly, and the number of False Negatives decreased by one compared to the baseline model.
    • Given the minimal performance gain and increased training time, further models will revert to 25 epochs.

6.5 - Model 2 - Increase batch_size¶

  • Model Configuration
    • Increase the mini-batch size to 64 on Model 0 configuration to reduce the number of parameter updates per epoch and evaluate its impact on model performance and training efficiency.
In [ ]:
# Define model configuration string to use in the title of graphs
model_2_config = "(1HL,7N,relu \n sgd \n bs64, 25E) "
In [ ]:
# Clear the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
In [ ]:
# Initializing the sequential model
model_2 = Sequential()

# Add the first hidden layer with 7 neurons, ReLU activation, and input dimension matching the number of features in X_train
model_2.add(Dense(7, activation="relu",input_dim=X_train.shape[1]))

# Add the output layer with 1 neuron and sigmoid activation for binary classification probability output
model_2.add(Dense(1, activation="sigmoid"))
In [ ]:
model_2.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense (Dense)                   │ (None, 7)              │           287 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 1)              │             8 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 295 (1.15 KB)
 Trainable params: 295 (1.15 KB)
 Non-trainable params: 0 (0.00 B)
In [ ]:
# Define SGD as the optimizer to be used
optimizer = tf.keras.optimizers.SGD()

# Recall is the chosen metric to measure
model_2.compile(loss='binary_crossentropy', optimizer=optimizer, metrics = ['Recall'])
In [ ]:
start = time.time()
history = model_2.fit(X_train, y_train, validation_data=(X_val,y_val), batch_size=64, epochs=25, class_weight=cw_dict)
end=time.time()
Epoch 1/25
250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8326 - loss: 1.1976 - val_Recall: 0.8829 - val_loss: 0.4640
Epoch 2/25
250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8828 - loss: 0.7161 - val_Recall: 0.8739 - val_loss: 0.3708
Epoch 3/25
250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8861 - loss: 0.6216 - val_Recall: 0.8694 - val_loss: 0.3293
Epoch 4/25
250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8920 - loss: 0.5772 - val_Recall: 0.8829 - val_loss: 0.3081
Epoch 5/25
250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8990 - loss: 0.5504 - val_Recall: 0.9009 - val_loss: 0.2960
Epoch 6/25
250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - Recall: 0.8978 - loss: 0.5321 - val_Recall: 0.9054 - val_loss: 0.2864
Epoch 7/25
250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - Recall: 0.9018 - loss: 0.5175 - val_Recall: 0.9009 - val_loss: 0.2775
Epoch 8/25
250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.9021 - loss: 0.5070 - val_Recall: 0.9009 - val_loss: 0.2698
Epoch 9/25
250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9044 - loss: 0.4991 - val_Recall: 0.8964 - val_loss: 0.2643
Epoch 10/25
250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9090 - loss: 0.4925 - val_Recall: 0.8964 - val_loss: 0.2598
Epoch 11/25
250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9023 - loss: 0.4865 - val_Recall: 0.8919 - val_loss: 0.2571
Epoch 12/25
250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9025 - loss: 0.4826 - val_Recall: 0.8874 - val_loss: 0.2532
Epoch 13/25
250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9009 - loss: 0.4783 - val_Recall: 0.8874 - val_loss: 0.2503
Epoch 14/25
250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8968 - loss: 0.4747 - val_Recall: 0.8874 - val_loss: 0.2477
Epoch 15/25
250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8959 - loss: 0.4713 - val_Recall: 0.8874 - val_loss: 0.2454
Epoch 16/25
250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9010 - loss: 0.4679 - val_Recall: 0.8874 - val_loss: 0.2420
Epoch 17/25
250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8927 - loss: 0.4639 - val_Recall: 0.8874 - val_loss: 0.2385
Epoch 18/25
250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8988 - loss: 0.4601 - val_Recall: 0.8829 - val_loss: 0.2341
Epoch 19/25
250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8969 - loss: 0.4560 - val_Recall: 0.8784 - val_loss: 0.2302
Epoch 20/25
250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8971 - loss: 0.4521 - val_Recall: 0.8739 - val_loss: 0.2270
Epoch 21/25
250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8972 - loss: 0.4481 - val_Recall: 0.8739 - val_loss: 0.2238
Epoch 22/25
250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8981 - loss: 0.4440 - val_Recall: 0.8739 - val_loss: 0.2215
Epoch 23/25
250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8974 - loss: 0.4395 - val_Recall: 0.8739 - val_loss: 0.2187
Epoch 24/25
250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8987 - loss: 0.4347 - val_Recall: 0.8739 - val_loss: 0.2135
Epoch 25/25
250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9000 - loss: 0.4302 - val_Recall: 0.8739 - val_loss: 0.2118
In [ ]:
print("Time taken in seconds ",end - start)
Time taken in seconds  16.49884581565857
In [ ]:
plot(history,'loss', model_2_config)
No description has been provided for this image
In [ ]:
plot(history,'Recall', model_2_config)
No description has been provided for this image
In [ ]:
model_2_train_perf = model_performance_classification(model_2, X_train, y_train)
model_2_train_perf
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step
Out[ ]:
Accuracy Recall Precision F1 Score
0 0.935187 0.919052 0.725615 0.785717
In [ ]:
model_2_val_perf = model_performance_classification(model_2, X_val, y_val)
model_2_val_perf
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step  
Out[ ]:
Accuracy Recall Precision F1 Score
0 0.9375 0.907556 0.729267 0.787096
In [ ]:
# Evaluate model performance on the test set and visualize the confusion matrix.
y_test_pred = model_2.predict(X_test)
#print(y_test_pred[0:10, :])
for i in range(len(y_test)):
    if y_test_pred[i]>0.5:
        y_test_pred[i]=1
    else:
        y_test_pred[i]=0

cm2=confusion_matrix(y_test, y_test_pred)
make_confusion_matrix(
    cm2,
    cmap='plasma',
    title="Confusion Matrix for Model 2 \n" + model_2_config
    )
157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
[[4436  282]
 [  42  240]]
No description has been provided for this image

Observation

  • Model 2 (Increased batch_size):
    • Increasing the batch size to 64 resulted in a slight decrease in model performance across evaluation metrics.
    • Specifically, a marginal reduction in recall was observed on both the training and validation sets.
    • Based on these findings, subsequent model iterations will utilize a batch size of 32 and be trained for 25 epochs.

6.6 - Model 3 - Increasing Neurons in the First Hidden Layer¶

  • Model Configuration
    • Increase the number of neurons in the first hidden layer to explore a more complex model capacity.
In [ ]:
# Define model configuration string to use in the title of graphs
model_3_config = "(1HL,21N,relu \n sgd \n bs32, 25E) "
In [ ]:
# Define the batch size and epochs upfront to use the same values for all models
epochs = 25
batch_size = 32
In [ ]:
# Clear the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
In [ ]:
# Initializing the sequential model
model_3 = Sequential()

# Add the first hidden layer with 7 neurons, ReLU activation, and input dimension matching the number of features in X_train
model_3.add(Dense(21, activation="relu",input_dim=X_train.shape[1]))

# Add the output layer with 1 neuron and sigmoid activation for binary classification probability output
model_3.add(Dense(1, activation="sigmoid"))
In [ ]:
model_3.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense (Dense)                   │ (None, 21)             │           861 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 1)              │            22 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 883 (3.45 KB)
 Trainable params: 883 (3.45 KB)
 Non-trainable params: 0 (0.00 B)
In [ ]:
# Define SGD as the optimizer to be used
optimizer = tf.keras.optimizers.SGD()

# Recall is the chosen metric to measure
model_3.compile(loss='binary_crossentropy', optimizer=optimizer, metrics = ['Recall'])
In [ ]:
start = time.time()
history = model_3.fit(X_train, y_train, validation_data=(X_val,y_val), batch_size=batch_size, epochs=epochs, class_weight=cw_dict)
end=time.time()
Epoch 1/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 5s 5ms/step - Recall: 0.8085 - loss: 0.9271 - val_Recall: 0.8964 - val_loss: 0.3242
Epoch 2/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 3s 5ms/step - Recall: 0.9049 - loss: 0.5349 - val_Recall: 0.8829 - val_loss: 0.2617
Epoch 3/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9092 - loss: 0.4641 - val_Recall: 0.8874 - val_loss: 0.2344
Epoch 4/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9088 - loss: 0.4273 - val_Recall: 0.8919 - val_loss: 0.2185
Epoch 5/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9062 - loss: 0.4064 - val_Recall: 0.8919 - val_loss: 0.2085
Epoch 6/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9013 - loss: 0.3941 - val_Recall: 0.8919 - val_loss: 0.2010
Epoch 7/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9035 - loss: 0.3848 - val_Recall: 0.8919 - val_loss: 0.1968
Epoch 8/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - Recall: 0.9035 - loss: 0.3774 - val_Recall: 0.8919 - val_loss: 0.1916
Epoch 9/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9035 - loss: 0.3715 - val_Recall: 0.8919 - val_loss: 0.1880
Epoch 10/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9056 - loss: 0.3667 - val_Recall: 0.8919 - val_loss: 0.1847
Epoch 11/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9071 - loss: 0.3621 - val_Recall: 0.8919 - val_loss: 0.1823
Epoch 12/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9056 - loss: 0.3580 - val_Recall: 0.8964 - val_loss: 0.1795
Epoch 13/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9071 - loss: 0.3542 - val_Recall: 0.8919 - val_loss: 0.1783
Epoch 14/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9063 - loss: 0.3508 - val_Recall: 0.8919 - val_loss: 0.1768
Epoch 15/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9080 - loss: 0.3481 - val_Recall: 0.8919 - val_loss: 0.1745
Epoch 16/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9084 - loss: 0.3448 - val_Recall: 0.8919 - val_loss: 0.1723
Epoch 17/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9084 - loss: 0.3419 - val_Recall: 0.8919 - val_loss: 0.1692
Epoch 18/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9084 - loss: 0.3391 - val_Recall: 0.8919 - val_loss: 0.1669
Epoch 19/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 3s 3ms/step - Recall: 0.9103 - loss: 0.3367 - val_Recall: 0.8919 - val_loss: 0.1650
Epoch 20/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9116 - loss: 0.3336 - val_Recall: 0.8919 - val_loss: 0.1629
Epoch 21/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9116 - loss: 0.3316 - val_Recall: 0.8919 - val_loss: 0.1621
Epoch 22/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9116 - loss: 0.3288 - val_Recall: 0.8919 - val_loss: 0.1616
Epoch 23/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9116 - loss: 0.3276 - val_Recall: 0.8919 - val_loss: 0.1607
Epoch 24/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9116 - loss: 0.3250 - val_Recall: 0.8919 - val_loss: 0.1597
Epoch 25/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9116 - loss: 0.3227 - val_Recall: 0.8919 - val_loss: 0.1588
In [ ]:
print("Time taken in seconds ",end - start)
Time taken in seconds  34.553746700286865
In [ ]:
plot(history,'loss', model_3_config)
No description has been provided for this image
In [ ]:
plot(history,'Recall', model_3_config)
No description has been provided for this image
In [ ]:
model_3_train_perf = model_performance_classification(model_3, X_train, y_train)
model_3_train_perf
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step
Out[ ]:
Accuracy Recall Precision F1 Score
0 0.961125 0.941792 0.796185 0.851689
In [ ]:
model_3_val_perf = model_performance_classification(model_3, X_val, y_val)
model_3_val_perf
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
Out[ ]:
Accuracy Recall Precision F1 Score
0 0.96025 0.928079 0.794025 0.846078
In [ ]:
# Evaluate model performance on the test set and visualize the confusion matrix.
y_test_pred = model_3.predict(X_test)
y_test_pred[0:10, :]
for i in range(len(y_test)):
    if y_test_pred[i]>0.5:
        y_test_pred[i]=1
    else:
        y_test_pred[i]=0

cm2=confusion_matrix(y_test, y_test_pred)
make_confusion_matrix(
    cm2,
    cmap='plasma',
    title="Confusion Matrix for Model 3 \n" + model_3_config
    )
157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step  
[[4553  165]
 [  37  245]]
No description has been provided for this image

Observation

  • Increasing the complexity by increasing neurons in the hidden layer
    • The loss function shows a steep decrease in the initial epochs and then stabilizes, indicating convergence more or less similar to the baseline model.
    • Recall has improved, and the number of false negatives has decreased.
    • The training time remains nominal.
    • This configuration for the first hidden layer will be used in subsequent models.

6.7 - Model 4 - Change Activation in the First Hidden Layer¶

  • Model Configuration
    • Change the activation function in the first hidden layer to leaky_relu and observe the performance
In [ ]:
# Define model configuration string to use in the title of graphs
model_4_config = "(1HL,21N,leaky_relu \n sgd \n bs32, 25E) "
In [ ]:
# Define the batch size and epochs upfront to use the same values for all models
epochs = 25
batch_size = 32
In [ ]:
# Clear the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
In [ ]:
# Initializing the sequential model
model_4 = Sequential()

# Add the first hidden layer with 7 neurons, Leaky ReLU activation, and input dimension matching the number of features in X_train
model_4.add(Dense(21, activation="leaky_relu",input_dim=X_train.shape[1]))

# Add the output layer with 1 neuron and sigmoid activation for binary classification probability output
model_4.add(Dense(1, activation="sigmoid"))
In [ ]:
model_4.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense (Dense)                   │ (None, 21)             │           861 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 1)              │            22 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 883 (3.45 KB)
 Trainable params: 883 (3.45 KB)
 Non-trainable params: 0 (0.00 B)
In [ ]:
# Define SGD as the optimizer to be used
optimizer = tf.keras.optimizers.SGD()

# Recall is the chosen metric to measure
model_4.compile(loss='binary_crossentropy', optimizer=optimizer, metrics = ['Recall'])
In [ ]:
start = time.time()
history = model_4.fit(X_train, y_train, validation_data=(X_val,y_val), batch_size=batch_size, epochs=epochs, class_weight=cw_dict)
end=time.time()
Epoch 1/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - Recall: 0.7609 - loss: 1.0298 - val_Recall: 0.8919 - val_loss: 0.3720
Epoch 2/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8901 - loss: 0.5994 - val_Recall: 0.8919 - val_loss: 0.3134
Epoch 3/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.9071 - loss: 0.5279 - val_Recall: 0.8919 - val_loss: 0.2797
Epoch 4/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.9112 - loss: 0.4831 - val_Recall: 0.8964 - val_loss: 0.2583
Epoch 5/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - Recall: 0.9070 - loss: 0.4558 - val_Recall: 0.9009 - val_loss: 0.2433
Epoch 6/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9026 - loss: 0.4364 - val_Recall: 0.9009 - val_loss: 0.2323
Epoch 7/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9033 - loss: 0.4213 - val_Recall: 0.9009 - val_loss: 0.2232
Epoch 8/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9033 - loss: 0.4104 - val_Recall: 0.9054 - val_loss: 0.2182
Epoch 9/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8987 - loss: 0.4008 - val_Recall: 0.9054 - val_loss: 0.2117
Epoch 10/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9031 - loss: 0.3923 - val_Recall: 0.9054 - val_loss: 0.2086
Epoch 11/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9031 - loss: 0.3841 - val_Recall: 0.9009 - val_loss: 0.2049
Epoch 12/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9042 - loss: 0.3774 - val_Recall: 0.9009 - val_loss: 0.2028
Epoch 13/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9054 - loss: 0.3727 - val_Recall: 0.9009 - val_loss: 0.2002
Epoch 14/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9054 - loss: 0.3680 - val_Recall: 0.9009 - val_loss: 0.1984
Epoch 15/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - Recall: 0.9057 - loss: 0.3649 - val_Recall: 0.9009 - val_loss: 0.1954
Epoch 16/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9057 - loss: 0.3602 - val_Recall: 0.9009 - val_loss: 0.1942
Epoch 17/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9067 - loss: 0.3575 - val_Recall: 0.9009 - val_loss: 0.1933
Epoch 18/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9074 - loss: 0.3545 - val_Recall: 0.9009 - val_loss: 0.1931
Epoch 19/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9074 - loss: 0.3517 - val_Recall: 0.9009 - val_loss: 0.1925
Epoch 20/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9074 - loss: 0.3489 - val_Recall: 0.9009 - val_loss: 0.1916
Epoch 21/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9074 - loss: 0.3464 - val_Recall: 0.9009 - val_loss: 0.1910
Epoch 22/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9083 - loss: 0.3438 - val_Recall: 0.9009 - val_loss: 0.1903
Epoch 23/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9083 - loss: 0.3415 - val_Recall: 0.9009 - val_loss: 0.1889
Epoch 24/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9083 - loss: 0.3395 - val_Recall: 0.9009 - val_loss: 0.1887
Epoch 25/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.9083 - loss: 0.3378 - val_Recall: 0.9009 - val_loss: 0.1883
In [ ]:
print("Time taken in seconds ",end - start)
Time taken in seconds  30.310200452804565
In [ ]:
plot(history,'loss', model_4_config)
No description has been provided for this image
In [ ]:
plot(history,'Recall', model_4_config)
No description has been provided for this image
In [ ]:
model_4_train_perf = model_performance_classification(model_4, X_train, y_train)
model_4_train_perf
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step
Out[ ]:
Accuracy Recall Precision F1 Score
0 0.9475 0.933519 0.755054 0.815737
In [ ]:
model_4_val_perf = model_performance_classification(model_4, X_val, y_val)
model_4_val_perf
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step  
Out[ ]:
Accuracy Recall Precision F1 Score
0 0.95 0.926893 0.761513 0.81982
In [ ]:
# Evaluate model performance on the test set and visualize the confusion matrix.
y_test_pred = model_4.predict(X_test)
for i in range(len(y_test)):
    if y_test_pred[i]>0.5:
        y_test_pred[i]=1
    else:
        y_test_pred[i]=0

cm2=confusion_matrix(y_test, y_test_pred)
make_confusion_matrix(
    cm2,
    cmap='plasma',
    title="Confusion Matrix for Model 4 \n" + model_4_config
    )
157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
[[4495  223]
 [  41  241]]
No description has been provided for this image

Observation

  • Changing the activation function in the hidden layer hasn't changed much in training and validation recall, while the loss curve remains largely unchanged.
  • Number of false negatives have increased
  • leaky_relu will be used for the first hidden layer in subsequent models.

6.8 - Model 5 - Add additional hidden layer¶

  • Model Configuration
    • Add additional layer over Model 4 and observe the performance
In [ ]:
# Define model configuration string to use in the title of graphs
model_5_config = "(1HL,21N,leaky_relu + 2HL,14N,relu \n sgd \n bs32, 25E) "
In [ ]:
# Clear the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
In [ ]:
# Initializing the sequential model
model_5 = Sequential()

# Add the first hidden layer with 7 neurons, ReLU activation, and input dimension matching the number of features in X_train
model_5.add(Dense(21, activation="leaky_relu",input_dim=X_train.shape[1]))

# Additional hidden layer with 14 neurons, ReLU activation
model_5.add(Dense(14, activation="relu"))

# Add the output layer with 1 neuron and sigmoid activation for binary classification probability output
model_5.add(Dense(1, activation="sigmoid"))
In [ ]:
model_5.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense (Dense)                   │ (None, 21)             │           861 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 14)             │           308 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_2 (Dense)                 │ (None, 1)              │            15 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 1,184 (4.62 KB)
 Trainable params: 1,184 (4.62 KB)
 Non-trainable params: 0 (0.00 B)
In [ ]:
# Define SGD as the optimizer to be used
optimizer = tf.keras.optimizers.SGD()

# Recall is the chosen metric to measure
model_5.compile(loss='binary_crossentropy', optimizer=optimizer, metrics = ['Recall'])
In [ ]:
start = time.time()
history = model_5.fit(X_train, y_train, validation_data=(X_val,y_val) , batch_size=batch_size, epochs=epochs, class_weight=cw_dict)
end=time.time()
Epoch 1/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 3s 4ms/step - Recall: 0.7955 - loss: 0.8487 - val_Recall: 0.8874 - val_loss: 0.3219
Epoch 2/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8749 - loss: 0.5399 - val_Recall: 0.8829 - val_loss: 0.2637
Epoch 3/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8766 - loss: 0.4656 - val_Recall: 0.8829 - val_loss: 0.2300
Epoch 4/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9067 - loss: 0.4176 - val_Recall: 0.8919 - val_loss: 0.2206
Epoch 5/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9068 - loss: 0.3899 - val_Recall: 0.8964 - val_loss: 0.2150
Epoch 6/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9070 - loss: 0.3739 - val_Recall: 0.8964 - val_loss: 0.1920
Epoch 7/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9095 - loss: 0.3627 - val_Recall: 0.9009 - val_loss: 0.1865
Epoch 8/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9099 - loss: 0.3527 - val_Recall: 0.9009 - val_loss: 0.1832
Epoch 9/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9109 - loss: 0.3447 - val_Recall: 0.9009 - val_loss: 0.1805
Epoch 10/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9147 - loss: 0.3377 - val_Recall: 0.9009 - val_loss: 0.1790
Epoch 11/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.9147 - loss: 0.3306 - val_Recall: 0.8964 - val_loss: 0.1763
Epoch 12/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9126 - loss: 0.3251 - val_Recall: 0.8964 - val_loss: 0.1697
Epoch 13/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - Recall: 0.9154 - loss: 0.3193 - val_Recall: 0.8964 - val_loss: 0.1688
Epoch 14/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9154 - loss: 0.3133 - val_Recall: 0.8964 - val_loss: 0.1658
Epoch 15/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9186 - loss: 0.3084 - val_Recall: 0.8964 - val_loss: 0.1641
Epoch 16/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9186 - loss: 0.3041 - val_Recall: 0.8964 - val_loss: 0.1631
Epoch 17/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9195 - loss: 0.2998 - val_Recall: 0.8964 - val_loss: 0.1580
Epoch 18/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9195 - loss: 0.2945 - val_Recall: 0.8964 - val_loss: 0.1527
Epoch 19/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9196 - loss: 0.2915 - val_Recall: 0.8964 - val_loss: 0.1525
Epoch 20/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9197 - loss: 0.2874 - val_Recall: 0.8964 - val_loss: 0.1490
Epoch 21/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9197 - loss: 0.2842 - val_Recall: 0.8964 - val_loss: 0.1519
Epoch 22/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9197 - loss: 0.2819 - val_Recall: 0.8964 - val_loss: 0.1488
Epoch 23/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9197 - loss: 0.2777 - val_Recall: 0.9009 - val_loss: 0.1469
Epoch 24/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9205 - loss: 0.2754 - val_Recall: 0.9009 - val_loss: 0.1477
Epoch 25/25
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - Recall: 0.9209 - loss: 0.2738 - val_Recall: 0.9009 - val_loss: 0.1459
In [ ]:
print("Time taken in seconds ",end - start)
Time taken in seconds  32.00727105140686
In [ ]:
plot(history,'loss', model_5_config)
No description has been provided for this image
In [ ]:
plot(history,'Recall', model_5_config)
No description has been provided for this image
In [ ]:
model_5_train_perf = model_performance_classification(model_5,X_train,y_train)
model_5_train_perf
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step  
Out[ ]:
Accuracy Recall Precision F1 Score
0 0.970875 0.949073 0.834209 0.881674
In [ ]:
model_5_val_perf = model_performance_classification(model_5,X_val,y_val)
model_5_val_perf
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step  
Out[ ]:
Accuracy Recall Precision F1 Score
0 0.9705 0.937745 0.834868 0.878215
In [ ]:
# Evaluate model performance on the test set and visualize the confusion matrix.
y_test_pred = model_5.predict(X_test)
for i in range(len(y_test)):
    if y_test_pred[i]>0.5:
        y_test_pred[i]=1
    else:
        y_test_pred[i]=0

cm2=confusion_matrix(y_test, y_test_pred)
make_confusion_matrix(
    cm2,
    cmap='plasma',
    title="Confusion Matrix for Model 5 \n" + model_5_config
    )
157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
[[4586  132]
 [  36  246]]
No description has been provided for this image

Observation

  • Introducing a second hidden layer with 14 neurons and ReLU activation in Model 5 resulted in a marginal improvement in recall on the test set compared to Model 4.
  • This architectural change also led to a slight reduction in the number of false negatives on the test set.

6.9 - Model 6 - Increase Epochs¶

  • Model Configuration
    • Increase the number of training epochs for Model 5 to allow for further learning and potential performance improvement.
In [ ]:
# Define model configuration string to use in the title of graphs
model_6_config = "(1HL,21N,leaky_relu + 2HL,14N,relu \n sgd \n bs32, 50E) "
In [ ]:
epochs = 50
In [ ]:
# Clear the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
In [ ]:
# Initializing the sequential model
model_6 = Sequential()

# Add the first hidden layer with 7 neurons, ReLU activation, and input dimension matching the number of features in X_train
model_6.add(Dense(21, activation="leaky_relu",input_dim=X_train.shape[1]))

# Additional hidden layer with 14 neurons, ReLU activation
model_6.add(Dense(14, activation="relu"))

# Add the output layer with 1 neuron and sigmoid activation for binary classification probability output
model_6.add(Dense(1, activation="sigmoid"))
In [ ]:
model_6.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense (Dense)                   │ (None, 21)             │           861 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 14)             │           308 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_2 (Dense)                 │ (None, 1)              │            15 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 1,184 (4.62 KB)
 Trainable params: 1,184 (4.62 KB)
 Non-trainable params: 0 (0.00 B)
In [ ]:
# Define SGD as the optimizer to be used
optimizer = tf.keras.optimizers.SGD()

# Recall is the chosen metric to measure
model_6.compile(loss='binary_crossentropy', optimizer=optimizer, metrics = ['Recall'])
In [ ]:
start = time.time()
history = model_6.fit(X_train, y_train, validation_data=(X_val,y_val) , batch_size=batch_size, epochs=epochs, class_weight=cw_dict)
end=time.time()
Epoch 1/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - Recall: 0.7448 - loss: 0.9799 - val_Recall: 0.8919 - val_loss: 0.3449
Epoch 2/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8984 - loss: 0.5276 - val_Recall: 0.8874 - val_loss: 0.2748
Epoch 3/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9002 - loss: 0.4526 - val_Recall: 0.8874 - val_loss: 0.2463
Epoch 4/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9034 - loss: 0.4143 - val_Recall: 0.8874 - val_loss: 0.2286
Epoch 5/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9040 - loss: 0.3904 - val_Recall: 0.8874 - val_loss: 0.2253
Epoch 6/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9058 - loss: 0.3751 - val_Recall: 0.8874 - val_loss: 0.2149
Epoch 7/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9049 - loss: 0.3623 - val_Recall: 0.8874 - val_loss: 0.2106
Epoch 8/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9070 - loss: 0.3508 - val_Recall: 0.8919 - val_loss: 0.2074
Epoch 9/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.9089 - loss: 0.3410 - val_Recall: 0.8964 - val_loss: 0.2024
Epoch 10/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9106 - loss: 0.3307 - val_Recall: 0.8964 - val_loss: 0.1861
Epoch 11/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9110 - loss: 0.3208 - val_Recall: 0.9009 - val_loss: 0.1870
Epoch 12/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9110 - loss: 0.3134 - val_Recall: 0.9009 - val_loss: 0.1873
Epoch 13/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9123 - loss: 0.3070 - val_Recall: 0.9009 - val_loss: 0.1825
Epoch 14/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9141 - loss: 0.3012 - val_Recall: 0.8964 - val_loss: 0.1805
Epoch 15/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9141 - loss: 0.2974 - val_Recall: 0.8964 - val_loss: 0.1817
Epoch 16/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9141 - loss: 0.2917 - val_Recall: 0.8964 - val_loss: 0.1763
Epoch 17/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9172 - loss: 0.2885 - val_Recall: 0.9009 - val_loss: 0.1690
Epoch 18/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9183 - loss: 0.2835 - val_Recall: 0.8964 - val_loss: 0.1639
Epoch 19/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9159 - loss: 0.2803 - val_Recall: 0.9009 - val_loss: 0.1621
Epoch 20/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9186 - loss: 0.2770 - val_Recall: 0.9009 - val_loss: 0.1590
Epoch 21/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9186 - loss: 0.2738 - val_Recall: 0.9009 - val_loss: 0.1585
Epoch 22/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9182 - loss: 0.2705 - val_Recall: 0.9009 - val_loss: 0.1496
Epoch 23/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9182 - loss: 0.2669 - val_Recall: 0.8964 - val_loss: 0.1507
Epoch 24/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9191 - loss: 0.2642 - val_Recall: 0.8919 - val_loss: 0.1453
Epoch 25/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9191 - loss: 0.2625 - val_Recall: 0.9009 - val_loss: 0.1445
Epoch 26/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9199 - loss: 0.2589 - val_Recall: 0.8919 - val_loss: 0.1481
Epoch 27/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9191 - loss: 0.2578 - val_Recall: 0.9009 - val_loss: 0.1464
Epoch 28/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9199 - loss: 0.2562 - val_Recall: 0.8919 - val_loss: 0.1438
Epoch 29/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9199 - loss: 0.2534 - val_Recall: 0.8919 - val_loss: 0.1469
Epoch 30/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9199 - loss: 0.2500 - val_Recall: 0.8919 - val_loss: 0.1439
Epoch 31/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9202 - loss: 0.2499 - val_Recall: 0.8874 - val_loss: 0.1399
Epoch 32/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9212 - loss: 0.2453 - val_Recall: 0.8874 - val_loss: 0.1434
Epoch 33/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9214 - loss: 0.2440 - val_Recall: 0.8874 - val_loss: 0.1398
Epoch 34/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9223 - loss: 0.2422 - val_Recall: 0.8874 - val_loss: 0.1372
Epoch 35/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9228 - loss: 0.2394 - val_Recall: 0.8874 - val_loss: 0.1410
Epoch 36/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9238 - loss: 0.2382 - val_Recall: 0.8874 - val_loss: 0.1369
Epoch 37/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9233 - loss: 0.2348 - val_Recall: 0.8874 - val_loss: 0.1352
Epoch 38/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9236 - loss: 0.2361 - val_Recall: 0.8874 - val_loss: 0.1371
Epoch 39/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9250 - loss: 0.2326 - val_Recall: 0.8874 - val_loss: 0.1364
Epoch 40/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9247 - loss: 0.2309 - val_Recall: 0.8874 - val_loss: 0.1384
Epoch 41/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9258 - loss: 0.2287 - val_Recall: 0.8874 - val_loss: 0.1351
Epoch 42/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9250 - loss: 0.2263 - val_Recall: 0.8874 - val_loss: 0.1378
Epoch 43/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9258 - loss: 0.2248 - val_Recall: 0.8919 - val_loss: 0.1348
Epoch 44/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9250 - loss: 0.2239 - val_Recall: 0.8919 - val_loss: 0.1377
Epoch 45/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9258 - loss: 0.2206 - val_Recall: 0.8919 - val_loss: 0.1333
Epoch 46/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9266 - loss: 0.2205 - val_Recall: 0.8919 - val_loss: 0.1373
Epoch 47/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9279 - loss: 0.2173 - val_Recall: 0.8919 - val_loss: 0.1333
Epoch 48/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9266 - loss: 0.2168 - val_Recall: 0.8919 - val_loss: 0.1356
Epoch 49/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9294 - loss: 0.2139 - val_Recall: 0.8919 - val_loss: 0.1342
Epoch 50/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9286 - loss: 0.2121 - val_Recall: 0.8919 - val_loss: 0.1363
In [ ]:
print("Time taken in seconds ",end - start)
Time taken in seconds  58.47771239280701
In [ ]:
plot(history,'loss', model_6_config)
No description has been provided for this image
In [ ]:
plot(history,'Recall', model_6_config)
No description has been provided for this image
In [ ]:
model_6_train_perf = model_performance_classification(model_6,X_train,y_train)
model_6_train_perf
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step
Out[ ]:
Accuracy Recall Precision F1 Score
0 0.972063 0.955002 0.838366 0.886548
In [ ]:
model_6_val_perf = model_performance_classification(model_6,X_val,y_val)
model_6_val_perf
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step
Out[ ]:
Accuracy Recall Precision F1 Score
0 0.9675 0.931917 0.822411 0.867729
In [ ]:
# Evaluate model performance on the test set and visualize the confusion matrix.
y_test_pred = model_6.predict(X_test)
for i in range(len(y_test)):
    if y_test_pred[i]>0.5:
        y_test_pred[i]=1
    else:
        y_test_pred[i]=0

cm2=confusion_matrix(y_test, y_test_pred)
make_confusion_matrix(
    cm2,
    cmap='plasma',
    title="Confusion Matrix for Model 6 \n" + model_6_config
    )
157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
[[4587  131]
 [  35  247]]
No description has been provided for this image

Observation

  • Increasing the epochs has slightly improved performance and reduced false negatives in test set

6.10 - Model 7 - Initialize weights using he_normal¶

  • Model Configuration
    • Initialize weights using he_normal and observe performance.
In [ ]:
# Define model configuration string to use in the title of graphs
model_7_config = "(1HL,21N,leaky_relu,he_normal + 2HL,14N,relu,he_normal \n sgd \n bs32, 50E) "
In [ ]:
# Clear the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
In [ ]:
# Initializing the sequential model
model_7 = Sequential()

# Add the first hidden layer with 7 neurons, ReLU activation, and input dimension matching the number of features in X_train
model_7.add(Dense(21, activation="leaky_relu",kernel_initializer='he_normal', input_dim=X_train.shape[1]))

# Additional hidden layer with 14 neurons, ReLU activation
model_7.add(Dense(14, activation="relu", kernel_initializer='he_normal'))

# Add the output layer with 1 neuron and sigmoid activation for binary classification probability output
model_7.add(Dense(1, activation="sigmoid"))
In [ ]:
model_7.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense (Dense)                   │ (None, 21)             │           861 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 14)             │           308 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_2 (Dense)                 │ (None, 1)              │            15 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 1,184 (4.62 KB)
 Trainable params: 1,184 (4.62 KB)
 Non-trainable params: 0 (0.00 B)
In [ ]:
# Define SGD as the optimizer to be used
optimizer = tf.keras.optimizers.SGD()

# Recall is the chosen metric to measure
model_7.compile(loss='binary_crossentropy', optimizer=optimizer, metrics = ['Recall'])
In [ ]:
start = time.time()
history = model_7.fit(X_train, y_train, validation_data=(X_val,y_val) , batch_size=batch_size, epochs=epochs, class_weight=cw_dict)
end=time.time()
Epoch 1/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - Recall: 0.6520 - loss: 1.0908 - val_Recall: 0.8874 - val_loss: 0.3304
Epoch 2/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9033 - loss: 0.5369 - val_Recall: 0.8784 - val_loss: 0.2714
Epoch 3/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9021 - loss: 0.4698 - val_Recall: 0.8829 - val_loss: 0.2307
Epoch 4/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.8985 - loss: 0.4297 - val_Recall: 0.8874 - val_loss: 0.2076
Epoch 5/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.8947 - loss: 0.3996 - val_Recall: 0.8919 - val_loss: 0.1924
Epoch 6/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - Recall: 0.9018 - loss: 0.3780 - val_Recall: 0.8964 - val_loss: 0.1853
Epoch 7/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9007 - loss: 0.3582 - val_Recall: 0.9009 - val_loss: 0.1906
Epoch 8/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9087 - loss: 0.3415 - val_Recall: 0.8964 - val_loss: 0.1807
Epoch 9/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9085 - loss: 0.3282 - val_Recall: 0.8964 - val_loss: 0.1748
Epoch 10/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9096 - loss: 0.3159 - val_Recall: 0.8964 - val_loss: 0.1721
Epoch 11/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9077 - loss: 0.3081 - val_Recall: 0.8964 - val_loss: 0.1703
Epoch 12/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9104 - loss: 0.3006 - val_Recall: 0.8964 - val_loss: 0.1631
Epoch 13/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9119 - loss: 0.2959 - val_Recall: 0.8964 - val_loss: 0.1573
Epoch 14/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9119 - loss: 0.2905 - val_Recall: 0.8919 - val_loss: 0.1489
Epoch 15/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9131 - loss: 0.2852 - val_Recall: 0.8919 - val_loss: 0.1500
Epoch 16/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9150 - loss: 0.2797 - val_Recall: 0.8964 - val_loss: 0.1503
Epoch 17/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9139 - loss: 0.2773 - val_Recall: 0.8964 - val_loss: 0.1426
Epoch 18/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9168 - loss: 0.2722 - val_Recall: 0.8964 - val_loss: 0.1424
Epoch 19/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9189 - loss: 0.2683 - val_Recall: 0.8964 - val_loss: 0.1372
Epoch 20/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9202 - loss: 0.2667 - val_Recall: 0.8964 - val_loss: 0.1395
Epoch 21/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9198 - loss: 0.2622 - val_Recall: 0.8964 - val_loss: 0.1348
Epoch 22/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9167 - loss: 0.2609 - val_Recall: 0.8964 - val_loss: 0.1333
Epoch 23/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9208 - loss: 0.2579 - val_Recall: 0.9009 - val_loss: 0.1274
Epoch 24/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9198 - loss: 0.2557 - val_Recall: 0.9009 - val_loss: 0.1335
Epoch 25/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9208 - loss: 0.2508 - val_Recall: 0.9054 - val_loss: 0.1289
Epoch 26/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9177 - loss: 0.2499 - val_Recall: 0.9009 - val_loss: 0.1231
Epoch 27/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9200 - loss: 0.2465 - val_Recall: 0.9054 - val_loss: 0.1290
Epoch 28/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.9209 - loss: 0.2446 - val_Recall: 0.9054 - val_loss: 0.1220
Epoch 29/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9192 - loss: 0.2421 - val_Recall: 0.9099 - val_loss: 0.1298
Epoch 30/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.9219 - loss: 0.2404 - val_Recall: 0.9099 - val_loss: 0.1275
Epoch 31/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9230 - loss: 0.2401 - val_Recall: 0.9099 - val_loss: 0.1259
Epoch 32/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9230 - loss: 0.2376 - val_Recall: 0.9099 - val_loss: 0.1238
Epoch 33/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9211 - loss: 0.2375 - val_Recall: 0.9054 - val_loss: 0.1293
Epoch 34/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9196 - loss: 0.2360 - val_Recall: 0.9099 - val_loss: 0.1200
Epoch 35/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9217 - loss: 0.2336 - val_Recall: 0.9099 - val_loss: 0.1202
Epoch 36/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9217 - loss: 0.2334 - val_Recall: 0.9099 - val_loss: 0.1237
Epoch 37/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9219 - loss: 0.2303 - val_Recall: 0.9099 - val_loss: 0.1305
Epoch 38/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - Recall: 0.9219 - loss: 0.2346 - val_Recall: 0.9099 - val_loss: 0.1223
Epoch 39/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9224 - loss: 0.2298 - val_Recall: 0.9099 - val_loss: 0.1273
Epoch 40/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9222 - loss: 0.2305 - val_Recall: 0.9144 - val_loss: 0.1270
Epoch 41/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9206 - loss: 0.2273 - val_Recall: 0.9144 - val_loss: 0.1269
Epoch 42/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9214 - loss: 0.2276 - val_Recall: 0.9144 - val_loss: 0.1278
Epoch 43/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9215 - loss: 0.2234 - val_Recall: 0.9144 - val_loss: 0.1374
Epoch 44/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9209 - loss: 0.2255 - val_Recall: 0.9099 - val_loss: 0.1243
Epoch 45/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9216 - loss: 0.2237 - val_Recall: 0.9144 - val_loss: 0.1381
Epoch 46/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9248 - loss: 0.2221 - val_Recall: 0.9099 - val_loss: 0.1297
Epoch 47/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9224 - loss: 0.2264 - val_Recall: 0.9144 - val_loss: 0.1310
Epoch 48/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9232 - loss: 0.2216 - val_Recall: 0.9099 - val_loss: 0.1195
Epoch 49/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9224 - loss: 0.2197 - val_Recall: 0.9144 - val_loss: 0.1249
Epoch 50/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9224 - loss: 0.2145 - val_Recall: 0.9144 - val_loss: 0.1236
In [ ]:
print("Time taken in seconds ",end - start)
Time taken in seconds  64.6731915473938
In [ ]:
plot(history,'loss', model_7_config)
No description has been provided for this image
In [ ]:
plot(history,'Recall', model_7_config)
No description has been provided for this image
In [ ]:
model_7_train_perf = model_performance_classification(model_7,X_train,y_train)
model_7_train_perf
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step
Out[ ]:
Accuracy Recall Precision F1 Score
0 0.976437 0.953608 0.860204 0.900625
In [ ]:
model_7_val_perf = model_performance_classification(model_7,X_val,y_val)
model_7_val_perf
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
Out[ ]:
Accuracy Recall Precision F1 Score
0 0.97475 0.946355 0.853583 0.893655
In [ ]:
# Evaluate model performance on the test set and visualize the confusion matrix.
y_test_pred = model_7.predict(X_test)
for i in range(len(y_test)):
    if y_test_pred[i]>0.5:
        y_test_pred[i]=1
    else:
        y_test_pred[i]=0

cm2=confusion_matrix(y_test, y_test_pred)
make_confusion_matrix(
    cm2,
    cmap='plasma',
    title="Confusion Matrix for Model 7 \n" + model_7_config
    )
157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
[[4601  117]
 [  37  245]]
No description has been provided for this image

Observation

  • Initializing weights with he_normal for Model 7 shows marginal improvements in precision and F1-score on both training and validation sets compared to Model 6, while recall remains largely similar.
  • The loss curve remains largely unchanged, indicating that the weight initialization strategy did not significantly impact the overall training convergence pattern.
  • The number of false negatives has increased slightly on the test set compared to Model 6.

6.11 - Model 8 - Add momentum¶

  • Model Configuration
    • Use SGD with momentum and observe performance.
In [ ]:
# Define model configuration string to use in the title of graphs
model_8_config = "(1HL,21N,leaky_relu,he_normal + 2HL,14N,relu,he_normal \n sgd,mom \n bs32, 50E) "
In [ ]:
# Clear the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
In [ ]:
# Initializing the sequential model
model_8 = Sequential()

# Add the first hidden layer with 7 neurons, ReLU activation, and input dimension matching the number of features in X_train
model_8.add(Dense(21, activation="leaky_relu",kernel_initializer='he_normal', input_dim=X_train.shape[1]))

# Additional hidden layer with 14 neurons, ReLU activation
model_8.add(Dense(14, activation="relu", kernel_initializer='he_normal'))

# Add the output layer with 1 neuron and sigmoid activation for binary classification probability output
model_8.add(Dense(1, activation="sigmoid"))
In [ ]:
model_8.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense (Dense)                   │ (None, 21)             │           861 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 14)             │           308 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_2 (Dense)                 │ (None, 1)              │            15 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 1,184 (4.62 KB)
 Trainable params: 1,184 (4.62 KB)
 Non-trainable params: 0 (0.00 B)
In [ ]:
# Define SGD as the optimizer to be used
optimizer = tf.keras.optimizers.SGD(momentum=0.5)

# Recall is the chosen metric to measure
model_8.compile(loss='binary_crossentropy', optimizer=optimizer, metrics = ['Recall'])
In [ ]:
start = time.time()
history = model_8.fit(X_train, y_train, validation_data=(X_val,y_val) , batch_size=batch_size, epochs=epochs, class_weight=cw_dict)
end=time.time()
Epoch 1/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 3s 3ms/step - Recall: 0.8591 - loss: 0.8143 - val_Recall: 0.8829 - val_loss: 0.3043
Epoch 2/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.8942 - loss: 0.4942 - val_Recall: 0.8874 - val_loss: 0.2508
Epoch 3/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8921 - loss: 0.4228 - val_Recall: 0.8964 - val_loss: 0.2333
Epoch 4/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8986 - loss: 0.3806 - val_Recall: 0.8964 - val_loss: 0.2093
Epoch 5/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9030 - loss: 0.3491 - val_Recall: 0.9009 - val_loss: 0.2155
Epoch 6/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9081 - loss: 0.3322 - val_Recall: 0.9009 - val_loss: 0.1975
Epoch 7/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9099 - loss: 0.3137 - val_Recall: 0.9009 - val_loss: 0.1866
Epoch 8/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9161 - loss: 0.2999 - val_Recall: 0.9009 - val_loss: 0.1846
Epoch 9/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9177 - loss: 0.2901 - val_Recall: 0.8919 - val_loss: 0.1737
Epoch 10/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9251 - loss: 0.2812 - val_Recall: 0.8964 - val_loss: 0.1751
Epoch 11/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9251 - loss: 0.2773 - val_Recall: 0.8874 - val_loss: 0.1751
Epoch 12/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9265 - loss: 0.2714 - val_Recall: 0.8874 - val_loss: 0.1779
Epoch 13/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9251 - loss: 0.2672 - val_Recall: 0.8874 - val_loss: 0.1767
Epoch 14/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - Recall: 0.9263 - loss: 0.2656 - val_Recall: 0.8874 - val_loss: 0.1775
Epoch 15/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9281 - loss: 0.2618 - val_Recall: 0.8874 - val_loss: 0.1793
Epoch 16/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9247 - loss: 0.2583 - val_Recall: 0.8874 - val_loss: 0.1738
Epoch 17/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9281 - loss: 0.2585 - val_Recall: 0.8874 - val_loss: 0.1791
Epoch 18/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9270 - loss: 0.2554 - val_Recall: 0.8874 - val_loss: 0.1665
Epoch 19/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9270 - loss: 0.2539 - val_Recall: 0.8874 - val_loss: 0.1828
Epoch 20/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9295 - loss: 0.2533 - val_Recall: 0.8874 - val_loss: 0.1674
Epoch 21/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9295 - loss: 0.2535 - val_Recall: 0.8919 - val_loss: 0.1803
Epoch 22/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9303 - loss: 0.2495 - val_Recall: 0.8919 - val_loss: 0.1723
Epoch 23/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9301 - loss: 0.2477 - val_Recall: 0.8919 - val_loss: 0.1873
Epoch 24/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9295 - loss: 0.2472 - val_Recall: 0.8964 - val_loss: 0.1929
Epoch 25/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9304 - loss: 0.2395 - val_Recall: 0.8964 - val_loss: 0.1829
Epoch 26/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9284 - loss: 0.2478 - val_Recall: 0.8919 - val_loss: 0.1853
Epoch 27/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9316 - loss: 0.2416 - val_Recall: 0.8919 - val_loss: 0.1815
Epoch 28/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9317 - loss: 0.2375 - val_Recall: 0.8964 - val_loss: 0.2383
Epoch 29/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9311 - loss: 0.2373 - val_Recall: 0.9009 - val_loss: 0.1683
Epoch 30/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9303 - loss: 0.2347 - val_Recall: 0.9054 - val_loss: 0.1941
Epoch 31/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9311 - loss: 0.2263 - val_Recall: 0.8964 - val_loss: 0.2034
Epoch 32/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9348 - loss: 0.2265 - val_Recall: 0.8964 - val_loss: 0.1751
Epoch 33/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9302 - loss: 0.2256 - val_Recall: 0.9009 - val_loss: 0.1666
Epoch 34/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9302 - loss: 0.2282 - val_Recall: 0.9009 - val_loss: 0.2040
Epoch 35/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9327 - loss: 0.2219 - val_Recall: 0.8874 - val_loss: 0.1708
Epoch 36/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9322 - loss: 0.2174 - val_Recall: 0.8964 - val_loss: 0.3038
Epoch 37/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9332 - loss: 0.2340 - val_Recall: 0.9009 - val_loss: 0.1715
Epoch 38/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9313 - loss: 0.2193 - val_Recall: 0.8964 - val_loss: 0.1743
Epoch 39/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9355 - loss: 0.2147 - val_Recall: 0.8964 - val_loss: 0.1714
Epoch 40/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9330 - loss: 0.2112 - val_Recall: 0.9009 - val_loss: 0.2067
Epoch 41/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9391 - loss: 0.2105 - val_Recall: 0.9009 - val_loss: 0.1787
Epoch 42/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9326 - loss: 0.2123 - val_Recall: 0.9054 - val_loss: 0.2252
Epoch 43/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9362 - loss: 0.2029 - val_Recall: 0.9009 - val_loss: 0.1644
Epoch 44/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9376 - loss: 0.2073 - val_Recall: 0.9009 - val_loss: 0.1882
Epoch 45/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.9420 - loss: 0.2015 - val_Recall: 0.8964 - val_loss: 0.1878
Epoch 46/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9369 - loss: 0.2081 - val_Recall: 0.8964 - val_loss: 0.2051
Epoch 47/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9428 - loss: 0.2002 - val_Recall: 0.8964 - val_loss: 0.1873
Epoch 48/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9379 - loss: 0.2031 - val_Recall: 0.8964 - val_loss: 0.1594
Epoch 49/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9358 - loss: 0.1992 - val_Recall: 0.8964 - val_loss: 0.2066
Epoch 50/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9430 - loss: 0.1951 - val_Recall: 0.9009 - val_loss: 0.1766
In [ ]:
print("Time taken in seconds ",end - start)
Time taken in seconds  62.5771689414978
In [ ]:
plot(history,'loss', model_8_config)
No description has been provided for this image
In [ ]:
plot(history,'Recall', model_8_config)
No description has been provided for this image
In [ ]:
model_8_train_perf = model_performance_classification(model_8,X_train,y_train)
model_8_train_perf
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step  
Out[ ]:
Accuracy Recall Precision F1 Score
0 0.970375 0.953048 0.8312 0.880902
In [ ]:
model_8_val_perf = model_performance_classification(model_8,X_val,y_val)
model_8_val_perf
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step
Out[ ]:
Accuracy Recall Precision F1 Score
0 0.963 0.933775 0.803754 0.855033
In [ ]:
# Evaluate model performance on the test set and visualize the confusion matrix.
y_test_pred = model_8.predict(X_test)
for i in range(len(y_test)):
    if y_test_pred[i]>0.5:
        y_test_pred[i]=1
    else:
        y_test_pred[i]=0

cm2=confusion_matrix(y_test, y_test_pred)
make_confusion_matrix(
    cm2,
    cmap='plasma',
    title="Confusion Matrix for Model 8 \n" + model_8_config
    )
157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
[[4570  148]
 [  40  242]]
No description has been provided for this image

Observation

  • Adding momentum to the SGD optimizer in Model 8 improved training recall but did not lead to a consistent improvement in validation recall compared to Model 7.
  • The loss curves show some oscillation, particularly in the validation loss, which does not consistently decrease.
  • The number of false negatives on the test set remained the same as Model 7.

6.12 - Model 9 - Add third hidden layer¶

  • Model Configuration
    • Add a third hidden layer with 7 neurons and ReLU activation.
In [ ]:
# Define model configuration string to use in the title of graphs
model_9_config = "(1HL,21N,leaky_relu,he_normal + 2HL,14N,relu,he_normal + 3HL,7N,relu \n sgd,mom \n bs32 + 50E) "
In [ ]:
# Clear the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
In [ ]:
# Initializing the sequential model
model_9 = Sequential()

# Add the first hidden layer with 7 neurons, ReLU activation, and input dimension matching the number of features in X_train
model_9.add(Dense(21, activation="leaky_relu",kernel_initializer='he_normal', input_dim=X_train.shape[1]))

# Additional hidden layer with 14 neurons, ReLU activation
model_9.add(Dense(14, activation="relu", kernel_initializer='he_normal'))

# Additional hidden layer with 7 neurons, ReLU activation
model_9.add(Dense(7, activation="relu"))

# Add the output layer with 1 neuron and sigmoid activation for binary classification probability output
model_9.add(Dense(1, activation="sigmoid"))
In [ ]:
model_9.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense (Dense)                   │ (None, 21)             │           861 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 14)             │           308 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_2 (Dense)                 │ (None, 7)              │           105 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_3 (Dense)                 │ (None, 1)              │             8 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 1,282 (5.01 KB)
 Trainable params: 1,282 (5.01 KB)
 Non-trainable params: 0 (0.00 B)
In [ ]:
# Define SGD as the optimizer to be used
optimizer = tf.keras.optimizers.SGD(momentum=0.5)

# Recall is the chosen metric to measure
model_9.compile(loss='binary_crossentropy', optimizer=optimizer, metrics = ['Recall'])
In [ ]:
start = time.time()
history = model_9.fit(X_train, y_train, validation_data=(X_val,y_val) , batch_size=batch_size, epochs=epochs, class_weight=cw_dict)
end=time.time()
Epoch 1/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.8232 - loss: 0.9220 - val_Recall: 0.8829 - val_loss: 0.3043
Epoch 2/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8943 - loss: 0.4756 - val_Recall: 0.9009 - val_loss: 0.2192
Epoch 3/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8934 - loss: 0.3922 - val_Recall: 0.9009 - val_loss: 0.1814
Epoch 4/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9028 - loss: 0.3498 - val_Recall: 0.8919 - val_loss: 0.1604
Epoch 5/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9046 - loss: 0.3268 - val_Recall: 0.8964 - val_loss: 0.1597
Epoch 6/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9061 - loss: 0.3134 - val_Recall: 0.9009 - val_loss: 0.1547
Epoch 7/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9099 - loss: 0.3021 - val_Recall: 0.9009 - val_loss: 0.1591
Epoch 8/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.9136 - loss: 0.2956 - val_Recall: 0.8964 - val_loss: 0.1515
Epoch 9/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9182 - loss: 0.2811 - val_Recall: 0.8919 - val_loss: 0.1419
Epoch 10/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9157 - loss: 0.2791 - val_Recall: 0.8919 - val_loss: 0.1507
Epoch 11/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9170 - loss: 0.2721 - val_Recall: 0.8964 - val_loss: 0.1637
Epoch 12/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9192 - loss: 0.2653 - val_Recall: 0.8964 - val_loss: 0.1606
Epoch 13/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9192 - loss: 0.2625 - val_Recall: 0.8874 - val_loss: 0.1580
Epoch 14/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9177 - loss: 0.2586 - val_Recall: 0.8874 - val_loss: 0.1598
Epoch 15/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9206 - loss: 0.2532 - val_Recall: 0.8874 - val_loss: 0.1524
Epoch 16/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9204 - loss: 0.2469 - val_Recall: 0.8919 - val_loss: 0.1496
Epoch 17/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9206 - loss: 0.2401 - val_Recall: 0.8874 - val_loss: 0.1653
Epoch 18/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9246 - loss: 0.2390 - val_Recall: 0.8964 - val_loss: 0.1396
Epoch 19/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.9225 - loss: 0.2337 - val_Recall: 0.8919 - val_loss: 0.1567
Epoch 20/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - Recall: 0.9261 - loss: 0.2249 - val_Recall: 0.8964 - val_loss: 0.1398
Epoch 21/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9268 - loss: 0.2228 - val_Recall: 0.8919 - val_loss: 0.1649
Epoch 22/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9243 - loss: 0.2387 - val_Recall: 0.8964 - val_loss: 0.1550
Epoch 23/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9268 - loss: 0.2203 - val_Recall: 0.9054 - val_loss: 0.1731
Epoch 24/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9260 - loss: 0.2185 - val_Recall: 0.9009 - val_loss: 0.1750
Epoch 25/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9269 - loss: 0.2207 - val_Recall: 0.8964 - val_loss: 0.2070
Epoch 26/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9268 - loss: 0.2190 - val_Recall: 0.8919 - val_loss: 0.1528
Epoch 27/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9294 - loss: 0.2139 - val_Recall: 0.8919 - val_loss: 0.1404
Epoch 28/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9291 - loss: 0.2168 - val_Recall: 0.8964 - val_loss: 0.1594
Epoch 29/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9283 - loss: 0.2046 - val_Recall: 0.8964 - val_loss: 0.1777
Epoch 30/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9355 - loss: 0.2010 - val_Recall: 0.9009 - val_loss: 0.1685
Epoch 31/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9288 - loss: 0.2037 - val_Recall: 0.9054 - val_loss: 0.2213
Epoch 32/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9332 - loss: 0.2301 - val_Recall: 0.8964 - val_loss: 0.1237
Epoch 33/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9282 - loss: 0.2191 - val_Recall: 0.8874 - val_loss: 0.1611
Epoch 34/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9335 - loss: 0.1995 - val_Recall: 0.8964 - val_loss: 0.1617
Epoch 35/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9308 - loss: 0.2044 - val_Recall: 0.8829 - val_loss: 0.2052
Epoch 36/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9346 - loss: 0.1950 - val_Recall: 0.8964 - val_loss: 0.1413
Epoch 37/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9311 - loss: 0.2012 - val_Recall: 0.8784 - val_loss: 0.1749
Epoch 38/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9339 - loss: 0.1893 - val_Recall: 0.8829 - val_loss: 0.1218
Epoch 39/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9342 - loss: 0.1905 - val_Recall: 0.8919 - val_loss: 0.1253
Epoch 40/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9310 - loss: 0.1880 - val_Recall: 0.8874 - val_loss: 0.1757
Epoch 41/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9331 - loss: 0.2019 - val_Recall: 0.8874 - val_loss: 0.1547
Epoch 42/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9323 - loss: 0.1963 - val_Recall: 0.9009 - val_loss: 0.2254
Epoch 43/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9311 - loss: 0.2026 - val_Recall: 0.8919 - val_loss: 0.1338
Epoch 44/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9317 - loss: 0.1979 - val_Recall: 0.8919 - val_loss: 0.1129
Epoch 45/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9340 - loss: 0.1778 - val_Recall: 0.8874 - val_loss: 0.1278
Epoch 46/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9312 - loss: 0.1882 - val_Recall: 0.8874 - val_loss: 0.2645
Epoch 47/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9315 - loss: 0.2059 - val_Recall: 0.9009 - val_loss: 0.1324
Epoch 48/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9347 - loss: 0.1806 - val_Recall: 0.8829 - val_loss: 0.1219
Epoch 49/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9365 - loss: 0.1837 - val_Recall: 0.8919 - val_loss: 0.1329
Epoch 50/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9417 - loss: 0.1686 - val_Recall: 0.8829 - val_loss: 0.1485
In [ ]:
print("Time taken in seconds ",end - start)
Time taken in seconds  62.6905243396759
In [ ]:
plot(history,'loss', model_9_config)
No description has been provided for this image
In [ ]:
plot(history,'Recall', model_9_config)
No description has been provided for this image
In [ ]:
model_9_train_perf = model_performance_classification(model_9,X_train,y_train)
model_9_train_perf
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step
Out[ ]:
Accuracy Recall Precision F1 Score
0 0.982187 0.952412 0.894097 0.920928
In [ ]:
model_9_val_perf = model_performance_classification(model_9,X_val,y_val)
model_9_val_perf
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
Out[ ]:
Accuracy Recall Precision F1 Score
0 0.979 0.933765 0.882356 0.906183
In [ ]:
# Evaluate model performance on the test set and visualize the confusion matrix.
y_test_pred = model_9.predict(X_test)
for i in range(len(y_test)):
    if y_test_pred[i]>0.5:
        y_test_pred[i]=1
    else:
        y_test_pred[i]=0

cm2=confusion_matrix(y_test, y_test_pred)
make_confusion_matrix(
    cm2,
    cmap='plasma',
    title="Confusion Matrix for Model 9 \n" + model_9_config
    )
157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
[[4622   96]
 [  41  241]]
No description has been provided for this image

Observation

  • Adding the third layer has not shown much improvement in performance
  • The gap between training and validation recall appears to be slightly widening, potentially indicating the beginning of overfitting.

6.13 - Model 10 - Add dropout layer¶

  • Model Configuration
    • Add dropout layer to see if it addresses overfitting issue
In [ ]:
# Define model configuration string to use in the title of graphs
model_10_config = "(1HL,21N,leaky_relu,he_normal + DP.4 + 2HL,14N,relu,he_normal + 3HL,7N,relu \n sgd,mom \n bs32 + 50E) "
In [ ]:
# Clear the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
In [ ]:
# Initializing the sequential model
model_10 = Sequential()

# Add the first hidden layer with 7 neurons, ReLU activation, and input dimension matching the number of features in X_train
model_10.add(Dense(21, activation="leaky_relu",kernel_initializer='he_normal', input_dim=X_train.shape[1]))

# Drop 40%
model_10.add(Dropout(0.4))

# Additional hidden layer with 14 neurons, ReLU activation
model_10.add(Dense(14, activation="relu", kernel_initializer='he_normal'))

# Additional hidden layer with 7 neurons, ReLU activation
model_10.add(Dense(7, activation="relu"))

# Add the output layer with 1 neuron and sigmoid activation for binary classification probability output
model_10.add(Dense(1, activation="sigmoid"))
In [ ]:
model_10.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense (Dense)                   │ (None, 21)             │           861 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout (Dropout)               │ (None, 21)             │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 14)             │           308 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_2 (Dense)                 │ (None, 7)              │           105 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_3 (Dense)                 │ (None, 1)              │             8 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 1,282 (5.01 KB)
 Trainable params: 1,282 (5.01 KB)
 Non-trainable params: 0 (0.00 B)
In [ ]:
# Define SGD as the optimizer to be used
optimizer = tf.keras.optimizers.SGD(momentum=0.5)

# Recall is the chosen metric to measure
model_10.compile(loss='binary_crossentropy', optimizer=optimizer, metrics = ['Recall'])
In [ ]:
start = time.time()
history = model_10.fit(X_train, y_train, validation_data=(X_val,y_val) , batch_size=batch_size, epochs=epochs, class_weight=cw_dict)
end=time.time()
Epoch 1/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.7453 - loss: 1.1225 - val_Recall: 0.9054 - val_loss: 0.4248
Epoch 2/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8470 - loss: 0.6650 - val_Recall: 0.8874 - val_loss: 0.2901
Epoch 3/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - Recall: 0.8483 - loss: 0.5884 - val_Recall: 0.8874 - val_loss: 0.2426
Epoch 4/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - Recall: 0.8644 - loss: 0.5189 - val_Recall: 0.8739 - val_loss: 0.2412
Epoch 5/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8694 - loss: 0.5012 - val_Recall: 0.8784 - val_loss: 0.2362
Epoch 6/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8719 - loss: 0.4947 - val_Recall: 0.8829 - val_loss: 0.2286
Epoch 7/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8860 - loss: 0.4810 - val_Recall: 0.8829 - val_loss: 0.2314
Epoch 8/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8836 - loss: 0.4854 - val_Recall: 0.8784 - val_loss: 0.1991
Epoch 9/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8941 - loss: 0.4626 - val_Recall: 0.8784 - val_loss: 0.1847
Epoch 10/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8759 - loss: 0.4710 - val_Recall: 0.8784 - val_loss: 0.1896
Epoch 11/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8820 - loss: 0.4572 - val_Recall: 0.8874 - val_loss: 0.1845
Epoch 12/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8788 - loss: 0.4828 - val_Recall: 0.8784 - val_loss: 0.1726
Epoch 13/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.8825 - loss: 0.4510 - val_Recall: 0.8784 - val_loss: 0.1647
Epoch 14/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - Recall: 0.8904 - loss: 0.4412 - val_Recall: 0.8829 - val_loss: 0.1579
Epoch 15/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.8884 - loss: 0.4346 - val_Recall: 0.8784 - val_loss: 0.1558
Epoch 16/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8882 - loss: 0.4213 - val_Recall: 0.8829 - val_loss: 0.1500
Epoch 17/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8822 - loss: 0.4289 - val_Recall: 0.8739 - val_loss: 0.1480
Epoch 18/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8994 - loss: 0.4049 - val_Recall: 0.8784 - val_loss: 0.1457
Epoch 19/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8806 - loss: 0.4343 - val_Recall: 0.8874 - val_loss: 0.1579
Epoch 20/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8934 - loss: 0.4151 - val_Recall: 0.8739 - val_loss: 0.1386
Epoch 21/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8938 - loss: 0.4139 - val_Recall: 0.8829 - val_loss: 0.1420
Epoch 22/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8876 - loss: 0.4082 - val_Recall: 0.8829 - val_loss: 0.1504
Epoch 23/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8891 - loss: 0.4051 - val_Recall: 0.8784 - val_loss: 0.1425
Epoch 24/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 3s 3ms/step - Recall: 0.8888 - loss: 0.4203 - val_Recall: 0.8784 - val_loss: 0.1314
Epoch 25/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8912 - loss: 0.4048 - val_Recall: 0.8829 - val_loss: 0.1386
Epoch 26/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8921 - loss: 0.3869 - val_Recall: 0.8784 - val_loss: 0.1474
Epoch 27/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8973 - loss: 0.4005 - val_Recall: 0.8874 - val_loss: 0.1456
Epoch 28/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8951 - loss: 0.3906 - val_Recall: 0.8739 - val_loss: 0.1202
Epoch 29/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9020 - loss: 0.3756 - val_Recall: 0.8784 - val_loss: 0.1357
Epoch 30/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8965 - loss: 0.3864 - val_Recall: 0.8829 - val_loss: 0.1306
Epoch 31/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8903 - loss: 0.4003 - val_Recall: 0.8694 - val_loss: 0.1236
Epoch 32/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8901 - loss: 0.3923 - val_Recall: 0.8874 - val_loss: 0.1452
Epoch 33/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9008 - loss: 0.3830 - val_Recall: 0.8919 - val_loss: 0.1406
Epoch 34/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - Recall: 0.8860 - loss: 0.3959 - val_Recall: 0.8829 - val_loss: 0.1293
Epoch 35/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.9006 - loss: 0.3716 - val_Recall: 0.8919 - val_loss: 0.1299
Epoch 36/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8894 - loss: 0.3802 - val_Recall: 0.8874 - val_loss: 0.1226
Epoch 37/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8767 - loss: 0.4103 - val_Recall: 0.8874 - val_loss: 0.1309
Epoch 38/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9071 - loss: 0.3606 - val_Recall: 0.8964 - val_loss: 0.1416
Epoch 39/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9031 - loss: 0.3770 - val_Recall: 0.9009 - val_loss: 0.1334
Epoch 40/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9038 - loss: 0.3770 - val_Recall: 0.9009 - val_loss: 0.1289
Epoch 41/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9032 - loss: 0.3500 - val_Recall: 0.8829 - val_loss: 0.1129
Epoch 42/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9076 - loss: 0.3650 - val_Recall: 0.8829 - val_loss: 0.1213
Epoch 43/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.9062 - loss: 0.3653 - val_Recall: 0.8874 - val_loss: 0.1249
Epoch 44/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 3s 3ms/step - Recall: 0.9007 - loss: 0.3690 - val_Recall: 0.8874 - val_loss: 0.1302
Epoch 45/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - Recall: 0.9038 - loss: 0.3645 - val_Recall: 0.8829 - val_loss: 0.1128
Epoch 46/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9044 - loss: 0.3501 - val_Recall: 0.8919 - val_loss: 0.1255
Epoch 47/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9007 - loss: 0.3645 - val_Recall: 0.8919 - val_loss: 0.1221
Epoch 48/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9033 - loss: 0.3569 - val_Recall: 0.8919 - val_loss: 0.1135
Epoch 49/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8954 - loss: 0.3506 - val_Recall: 0.8919 - val_loss: 0.1183
Epoch 50/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9015 - loss: 0.3607 - val_Recall: 0.8874 - val_loss: 0.1128
In [ ]:
print("Time taken in seconds ",end - start)
Time taken in seconds  69.95645785331726
In [ ]:
plot(history,'loss', model_10_config)
No description has been provided for this image
In [ ]:
plot(history,'Recall', model_10_config)
No description has been provided for this image
In [ ]:
model_10_train_perf = model_performance_classification(model_10,X_train,y_train)
model_10_train_perf
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step
Out[ ]:
Accuracy Recall Precision F1 Score
0 0.989938 0.950685 0.953087 0.951882
In [ ]:
model_10_val_perf = model_performance_classification(model_10,X_val,y_val)
model_10_val_perf
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step  
Out[ ]:
Accuracy Recall Precision F1 Score
0 0.99025 0.941841 0.963526 0.952388
In [ ]:
# Evaluate model performance on the test set and visualize the confusion matrix.
y_test_pred = model_10.predict(X_test)
for i in range(len(y_test)):
    if y_test_pred[i]>0.5:
        y_test_pred[i]=1
    else:
        y_test_pred[i]=0

cm2=confusion_matrix(y_test, y_test_pred)
make_confusion_matrix(
    cm2,
    cmap='plasma',
    title="Confusion Matrix for Model 10 \n" + model_10_config
    )
157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step
[[4696   22]
 [  38  244]]
No description has been provided for this image

Observation

  • The introduction of the Dropout layer effectively reduced the gap between training and val recall, suggesting improved generalization and reduced overfitting.
  • The number of False Negatives observed on the test set has slightly reduced.

6.14 - Model 11 - Change the optimizer to Adam¶

  • Model Configuration
    • Change the optimizer to Adam
In [ ]:
# Define model configuration string to use in the title of graphs
model_11_config = "(1HL,21N,leaky_relu,he_normal + 2HL,14N,relu,he_normal + 3HL,7N,relu \n adam \n bs32,50E) "
In [ ]:
# Clear the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
In [ ]:
# Initializing the sequential model
model_11 = Sequential()

# Add the first hidden layer with 7 neurons, ReLU activation, and input dimension matching the number of features in X_train
model_11.add(Dense(21, activation="leaky_relu",kernel_initializer='he_normal', input_dim=X_train.shape[1]))

# Additional hidden layer with 14 neurons, ReLU activation
model_11.add(Dense(14, activation="relu", kernel_initializer='he_normal'))

# Additional hidden layer with 7 neurons, ReLU activation
model_11.add(Dense(7, activation="relu"))

# Add the output layer with 1 neuron and sigmoid activation for binary classification probability output
model_11.add(Dense(1, activation="sigmoid"))
In [ ]:
model_11.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense (Dense)                   │ (None, 21)             │           861 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 14)             │           308 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_2 (Dense)                 │ (None, 7)              │           105 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_3 (Dense)                 │ (None, 1)              │             8 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 1,282 (5.01 KB)
 Trainable params: 1,282 (5.01 KB)
 Non-trainable params: 0 (0.00 B)
In [ ]:
# Define Adam as the optimizer to be used
optimizer = tf.keras.optimizers.Adam()

# Recall is the chosen metric to measure
model_11.compile(loss='binary_crossentropy', optimizer=optimizer, metrics = ['Recall'])
In [ ]:
start = time.time()
history = model_11.fit(X_train, y_train, validation_data=(X_val,y_val) , batch_size=batch_size, epochs=epochs, class_weight=cw_dict)
end=time.time()
Epoch 1/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 3s 3ms/step - Recall: 0.5831 - loss: 1.0448 - val_Recall: 0.9009 - val_loss: 0.2974
Epoch 2/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9005 - loss: 0.4562 - val_Recall: 0.8874 - val_loss: 0.2335
Epoch 3/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9034 - loss: 0.3772 - val_Recall: 0.8874 - val_loss: 0.2034
Epoch 4/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9103 - loss: 0.3410 - val_Recall: 0.8919 - val_loss: 0.1757
Epoch 5/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9141 - loss: 0.3159 - val_Recall: 0.8919 - val_loss: 0.1692
Epoch 6/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9127 - loss: 0.2968 - val_Recall: 0.8919 - val_loss: 0.1637
Epoch 7/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 3s 3ms/step - Recall: 0.9166 - loss: 0.2842 - val_Recall: 0.8964 - val_loss: 0.1608
Epoch 8/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9187 - loss: 0.2722 - val_Recall: 0.8919 - val_loss: 0.1777
Epoch 9/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9218 - loss: 0.2662 - val_Recall: 0.8919 - val_loss: 0.1718
Epoch 10/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9230 - loss: 0.2601 - val_Recall: 0.8964 - val_loss: 0.1743
Epoch 11/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9270 - loss: 0.2557 - val_Recall: 0.8874 - val_loss: 0.1704
Epoch 12/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.9270 - loss: 0.2484 - val_Recall: 0.8874 - val_loss: 0.1673
Epoch 13/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9273 - loss: 0.2415 - val_Recall: 0.8919 - val_loss: 0.1687
Epoch 14/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9281 - loss: 0.2390 - val_Recall: 0.8874 - val_loss: 0.1769
Epoch 15/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9283 - loss: 0.2370 - val_Recall: 0.8874 - val_loss: 0.1673
Epoch 16/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - Recall: 0.9302 - loss: 0.2303 - val_Recall: 0.8874 - val_loss: 0.1721
Epoch 17/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - Recall: 0.9338 - loss: 0.2269 - val_Recall: 0.8919 - val_loss: 0.1642
Epoch 18/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9352 - loss: 0.2250 - val_Recall: 0.8874 - val_loss: 0.1650
Epoch 19/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9329 - loss: 0.2253 - val_Recall: 0.8919 - val_loss: 0.1554
Epoch 20/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9357 - loss: 0.2207 - val_Recall: 0.8874 - val_loss: 0.1692
Epoch 21/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9346 - loss: 0.2181 - val_Recall: 0.8874 - val_loss: 0.1718
Epoch 22/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9373 - loss: 0.2189 - val_Recall: 0.8919 - val_loss: 0.1715
Epoch 23/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9380 - loss: 0.2183 - val_Recall: 0.8874 - val_loss: 0.1662
Epoch 24/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9373 - loss: 0.2152 - val_Recall: 0.8919 - val_loss: 0.1689
Epoch 25/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9346 - loss: 0.2164 - val_Recall: 0.8874 - val_loss: 0.1620
Epoch 26/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9380 - loss: 0.2115 - val_Recall: 0.8964 - val_loss: 0.1727
Epoch 27/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9339 - loss: 0.2096 - val_Recall: 0.8919 - val_loss: 0.1557
Epoch 28/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9379 - loss: 0.2091 - val_Recall: 0.9009 - val_loss: 0.1576
Epoch 29/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9347 - loss: 0.2043 - val_Recall: 0.8964 - val_loss: 0.1658
Epoch 30/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9373 - loss: 0.2051 - val_Recall: 0.8874 - val_loss: 0.1450
Epoch 31/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9392 - loss: 0.1989 - val_Recall: 0.8964 - val_loss: 0.1529
Epoch 32/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9376 - loss: 0.2015 - val_Recall: 0.8919 - val_loss: 0.1413
Epoch 33/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9349 - loss: 0.2002 - val_Recall: 0.8964 - val_loss: 0.1524
Epoch 34/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9392 - loss: 0.1976 - val_Recall: 0.8964 - val_loss: 0.1443
Epoch 35/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9375 - loss: 0.1943 - val_Recall: 0.8919 - val_loss: 0.1484
Epoch 36/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9383 - loss: 0.1951 - val_Recall: 0.8919 - val_loss: 0.1338
Epoch 37/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 3s 4ms/step - Recall: 0.9350 - loss: 0.1948 - val_Recall: 0.8919 - val_loss: 0.1406
Epoch 38/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - Recall: 0.9375 - loss: 0.1916 - val_Recall: 0.8919 - val_loss: 0.1331
Epoch 39/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9381 - loss: 0.1873 - val_Recall: 0.9009 - val_loss: 0.1555
Epoch 40/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9392 - loss: 0.1883 - val_Recall: 0.8874 - val_loss: 0.1265
Epoch 41/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9392 - loss: 0.1906 - val_Recall: 0.8919 - val_loss: 0.1292
Epoch 42/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9393 - loss: 0.1866 - val_Recall: 0.8919 - val_loss: 0.1241
Epoch 43/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9392 - loss: 0.1802 - val_Recall: 0.8874 - val_loss: 0.1250
Epoch 44/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9361 - loss: 0.1826 - val_Recall: 0.8919 - val_loss: 0.1416
Epoch 45/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9345 - loss: 0.1845 - val_Recall: 0.8874 - val_loss: 0.1344
Epoch 46/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - Recall: 0.9383 - loss: 0.1816 - val_Recall: 0.8874 - val_loss: 0.1249
Epoch 47/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9389 - loss: 0.1760 - val_Recall: 0.8874 - val_loss: 0.1244
Epoch 48/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - Recall: 0.9392 - loss: 0.1692 - val_Recall: 0.8874 - val_loss: 0.1323
Epoch 49/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9393 - loss: 0.1733 - val_Recall: 0.8874 - val_loss: 0.1259
Epoch 50/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9393 - loss: 0.1705 - val_Recall: 0.8829 - val_loss: 0.1274
In [ ]:
print("Time taken in seconds ",end - start)
Time taken in seconds  71.20552659034729
In [ ]:
plot(history,'loss', model_11_config)
No description has been provided for this image
In [ ]:
plot(history,'Recall', model_11_config)
No description has been provided for this image
In [ ]:
model_11_train_perf = model_performance_classification(model_11,X_train,y_train)
model_11_train_perf
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step
Out[ ]:
Accuracy Recall Precision F1 Score
0 0.98575 0.955358 0.916728 0.935058
In [ ]:
model_11_val_perf = model_performance_classification(model_11,X_val,y_val)
model_11_val_perf
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
Out[ ]:
Accuracy Recall Precision F1 Score
0 0.98225 0.935486 0.903181 0.918616
In [ ]:
# Evaluate model performance on the test set and visualize the confusion matrix.
y_test_pred = model_11.predict(X_test)
for i in range(len(y_test)):
    if y_test_pred[i]>0.5:
        y_test_pred[i]=1
    else:
        y_test_pred[i]=0

cm2=confusion_matrix(y_test, y_test_pred)
make_confusion_matrix(
    cm2,
    cmap='plasma',
    title="Confusion Matrix for Model 11 \n" + model_11_config
    )
157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step
[[4654   64]
 [  39  243]]
No description has been provided for this image

Observation

  • Using the Adam optimizer in Model 11 resulted in a decrease in the number of false negatives on the test set (33 compared to 43 in Model 9), which is a desirable outcome based on the problem's evaluation criteria.
  • The training and validation loss curves are smoother compared to the SGD optimizer, indicating more stable convergence during training.
  • The validation recall for Model 11 (93.31%) is slightly lower than Model 9 (92.74%), while the test recall for Model 11 is 88.30% compared to 84.75% for Model 9.

6.15 - Model 12 - Introduce Regularization (dropout)¶

  • Model Configuration
    • Introduce regularization by adding a Dropout layer after the first hidden layer to mitigate overfitting.
In [ ]:
# Define model configuration string to use in the title of graphs
model_12_config = "(1HL,21N,leaky_relu,he_normal + DR.4 + 2HL,14N,relu,he_normal + 3HL,7N,relu \n adam \n bs32,50E) "
In [ ]:
# Clear the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
In [ ]:
# Initializing the sequential model
model_12 = Sequential()

# Add the first hidden layer with 7 neurons, ReLU activation, and input dimension matching the number of features in X_train
model_12.add(Dense(21, activation="leaky_relu",kernel_initializer='he_normal', input_dim=X_train.shape[1]))

# Dropout 40%
model_12.add(Dropout(0.4))

# Additional hidden layer with 14 neurons, ReLU activation
model_12.add(Dense(14, activation="relu", kernel_initializer='he_normal'))

# Additional hidden layer with 7 neurons, ReLU activation
model_12.add(Dense(7, activation="relu"))

# Add the output layer with 1 neuron and sigmoid activation for binary classification probability output
model_12.add(Dense(1, activation="sigmoid"))
In [ ]:
model_12.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense (Dense)                   │ (None, 21)             │           861 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout (Dropout)               │ (None, 21)             │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 14)             │           308 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_2 (Dense)                 │ (None, 7)              │           105 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_3 (Dense)                 │ (None, 1)              │             8 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 1,282 (5.01 KB)
 Trainable params: 1,282 (5.01 KB)
 Non-trainable params: 0 (0.00 B)
In [ ]:
# Define Adam as the optimizer to be used
optimizer = tf.keras.optimizers.Adam()

# Recall is the chosen metric to measure
model_12.compile(loss='binary_crossentropy', optimizer=optimizer, metrics = ['Recall'])
In [ ]:
start = time.time()
history = model_12.fit(X_train, y_train, validation_data=(X_val,y_val) , batch_size=batch_size, epochs=epochs, class_weight=cw_dict)
end=time.time()
Epoch 1/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 4s 4ms/step - Recall: 0.7183 - loss: 1.1537 - val_Recall: 0.8649 - val_loss: 0.3780
Epoch 2/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8464 - loss: 0.7515 - val_Recall: 0.8739 - val_loss: 0.2993
Epoch 3/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8576 - loss: 0.6437 - val_Recall: 0.8649 - val_loss: 0.2517
Epoch 4/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8679 - loss: 0.5789 - val_Recall: 0.8694 - val_loss: 0.2385
Epoch 5/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8766 - loss: 0.5429 - val_Recall: 0.8784 - val_loss: 0.2020
Epoch 6/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8672 - loss: 0.5259 - val_Recall: 0.8649 - val_loss: 0.1790
Epoch 7/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8726 - loss: 0.4857 - val_Recall: 0.8829 - val_loss: 0.1841
Epoch 8/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8832 - loss: 0.4809 - val_Recall: 0.8784 - val_loss: 0.1860
Epoch 9/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.8751 - loss: 0.4905 - val_Recall: 0.8694 - val_loss: 0.1547
Epoch 10/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - Recall: 0.8898 - loss: 0.4657 - val_Recall: 0.8739 - val_loss: 0.1560
Epoch 11/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.8864 - loss: 0.4651 - val_Recall: 0.8784 - val_loss: 0.1626
Epoch 12/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8804 - loss: 0.4611 - val_Recall: 0.8784 - val_loss: 0.1557
Epoch 13/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8844 - loss: 0.4465 - val_Recall: 0.8739 - val_loss: 0.1369
Epoch 14/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8806 - loss: 0.4629 - val_Recall: 0.8694 - val_loss: 0.1448
Epoch 15/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8898 - loss: 0.4448 - val_Recall: 0.8784 - val_loss: 0.1457
Epoch 16/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8910 - loss: 0.4275 - val_Recall: 0.8739 - val_loss: 0.1330
Epoch 17/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8741 - loss: 0.4444 - val_Recall: 0.8694 - val_loss: 0.1339
Epoch 18/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8823 - loss: 0.4410 - val_Recall: 0.8739 - val_loss: 0.1446
Epoch 19/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8900 - loss: 0.4375 - val_Recall: 0.8649 - val_loss: 0.1325
Epoch 20/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - Recall: 0.8877 - loss: 0.4216 - val_Recall: 0.8784 - val_loss: 0.1335
Epoch 21/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - Recall: 0.8826 - loss: 0.4402 - val_Recall: 0.8784 - val_loss: 0.1324
Epoch 22/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8945 - loss: 0.4195 - val_Recall: 0.8739 - val_loss: 0.1287
Epoch 23/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8839 - loss: 0.4218 - val_Recall: 0.8784 - val_loss: 0.1329
Epoch 24/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8969 - loss: 0.4100 - val_Recall: 0.8784 - val_loss: 0.1299
Epoch 25/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8785 - loss: 0.4161 - val_Recall: 0.8829 - val_loss: 0.1419
Epoch 26/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8914 - loss: 0.4254 - val_Recall: 0.8694 - val_loss: 0.1334
Epoch 27/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8743 - loss: 0.4237 - val_Recall: 0.8694 - val_loss: 0.1300
Epoch 28/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8941 - loss: 0.4205 - val_Recall: 0.8694 - val_loss: 0.1236
Epoch 29/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8836 - loss: 0.4152 - val_Recall: 0.8739 - val_loss: 0.1352
Epoch 30/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - Recall: 0.8975 - loss: 0.4016 - val_Recall: 0.8739 - val_loss: 0.1271
Epoch 31/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - Recall: 0.8962 - loss: 0.3924 - val_Recall: 0.8694 - val_loss: 0.1239
Epoch 32/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8944 - loss: 0.3962 - val_Recall: 0.8874 - val_loss: 0.1336
Epoch 33/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8799 - loss: 0.4100 - val_Recall: 0.8784 - val_loss: 0.1263
Epoch 34/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8863 - loss: 0.4129 - val_Recall: 0.8649 - val_loss: 0.1254
Epoch 35/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9003 - loss: 0.3893 - val_Recall: 0.8649 - val_loss: 0.1323
Epoch 36/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8737 - loss: 0.4135 - val_Recall: 0.8694 - val_loss: 0.1246
Epoch 37/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8970 - loss: 0.3744 - val_Recall: 0.8694 - val_loss: 0.1218
Epoch 38/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8911 - loss: 0.3894 - val_Recall: 0.8649 - val_loss: 0.1234
Epoch 39/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8931 - loss: 0.3827 - val_Recall: 0.8784 - val_loss: 0.1192
Epoch 40/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - Recall: 0.8945 - loss: 0.3652 - val_Recall: 0.8829 - val_loss: 0.1315
Epoch 41/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.8880 - loss: 0.4078 - val_Recall: 0.8784 - val_loss: 0.1185
Epoch 42/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8900 - loss: 0.3929 - val_Recall: 0.8694 - val_loss: 0.1073
Epoch 43/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9090 - loss: 0.3451 - val_Recall: 0.8694 - val_loss: 0.1150
Epoch 44/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8897 - loss: 0.3599 - val_Recall: 0.8649 - val_loss: 0.1206
Epoch 45/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8912 - loss: 0.3755 - val_Recall: 0.8604 - val_loss: 0.1152
Epoch 46/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8953 - loss: 0.3632 - val_Recall: 0.8829 - val_loss: 0.1165
Epoch 47/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8940 - loss: 0.3791 - val_Recall: 0.8694 - val_loss: 0.1238
Epoch 48/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9010 - loss: 0.3582 - val_Recall: 0.8694 - val_loss: 0.1123
Epoch 49/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9010 - loss: 0.3467 - val_Recall: 0.8649 - val_loss: 0.1135
Epoch 50/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 3s 4ms/step - Recall: 0.8979 - loss: 0.3545 - val_Recall: 0.8649 - val_loss: 0.1063
In [ ]:
print("Time taken in seconds ",end - start)
Time taken in seconds  72.59007692337036
In [ ]:
plot(history,'loss', model_12_config)
No description has been provided for this image
In [ ]:
plot(history,'Recall', model_12_config)
No description has been provided for this image
In [ ]:
model_12_train_perf = model_performance_classification(model_12,X_train,y_train)
model_12_train_perf
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step
Out[ ]:
Accuracy Recall Precision F1 Score
0 0.9895 0.940914 0.957439 0.949003
In [ ]:
model_12_val_perf = model_performance_classification(model_12,X_val,y_val)
model_12_val_perf
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step  
Out[ ]:
Accuracy Recall Precision F1 Score
0 0.98775 0.929918 0.951017 0.94018
In [ ]:
# Evaluate model performance on the test set and visualize the confusion matrix.
y_test_pred = model_12.predict(X_test)
for i in range(len(y_test)):
    if y_test_pred[i]>0.5:
        y_test_pred[i]=1
    else:
        y_test_pred[i]=0

cm2=confusion_matrix(y_test, y_test_pred)
make_confusion_matrix(
    cm2,
    cmap='plasma',
    title="Confusion Matrix for Model 12 \n" + model_12_config
    )
157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
[[4703   15]
 [  43  239]]
No description has been provided for this image

Observation

  • Dropout has helped to reduce the gap between training and validation recall to reduce the overfitting.
  • Overall, not much improvement in the performace in terms of recall.

6.16 - Model 13 - Introduce learning_rate¶

  • Model Configuration
    • Adjust Learning Rate: Modify the learning rate of the Adam optimizer to influence the step size taken during parameter updates. A carefully chosen learning rate can help the model converge more smoothly and avoid oscillations in the loss function, leading to more stable training and potentially better performance.
In [ ]:
# Define model configuration string to use in the title of graphs
model_13_config = "(1HL,21N,leaky_relu,he_normal + DR.4 + 2HL,14N,relu,he_normal + 3HL,7N,relu \n adam,lr.0001 \n bs32, 50E) "
In [ ]:
# Clear the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
In [ ]:
# Initializing the sequential model
model_13 = Sequential()

# Add the first hidden layer with 7 neurons, ReLU activation, and input dimension matching the number of features in X_train
model_13.add(Dense(21, activation="leaky_relu",kernel_initializer='he_normal', input_dim=X_train.shape[1]))

# Dropout 40%
model_13.add(Dropout(0.4))

# Additional hidden layer with 14 neurons, ReLU activation
model_13.add(Dense(14, activation="relu", kernel_initializer='he_normal'))

# Additional hidden layer with 7 neurons, ReLU activation
model_13.add(Dense(7, activation="relu"))

# Add the output layer with 1 neuron and sigmoid activation for binary classification probability output
model_13.add(Dense(1, activation="sigmoid"))
In [ ]:
model_13.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense (Dense)                   │ (None, 21)             │           861 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout (Dropout)               │ (None, 21)             │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 14)             │           308 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_2 (Dense)                 │ (None, 7)              │           105 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_3 (Dense)                 │ (None, 1)              │             8 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 1,282 (5.01 KB)
 Trainable params: 1,282 (5.01 KB)
 Non-trainable params: 0 (0.00 B)
In [ ]:
# Define Adam as the optimizer to be used
optimizer = tf.keras.optimizers.Adam(learning_rate=0.0001)

# Recall is the chosen metric to measure
model_13.compile(loss='binary_crossentropy', optimizer=optimizer, metrics = ['Recall'])
In [ ]:
start = time.time()
history = model_13.fit(X_train, y_train, validation_data=(X_val,y_val) , batch_size=batch_size, epochs=epochs, class_weight=cw_dict)
end=time.time()
Epoch 1/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 3s 3ms/step - Recall: 0.7767 - loss: 1.4486 - val_Recall: 0.8829 - val_loss: 0.6958
Epoch 2/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8139 - loss: 1.2734 - val_Recall: 0.9144 - val_loss: 0.6463
Epoch 3/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.8269 - loss: 1.1683 - val_Recall: 0.9324 - val_loss: 0.6153
Epoch 4/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 3s 4ms/step - Recall: 0.8659 - loss: 1.0798 - val_Recall: 0.9189 - val_loss: 0.5845
Epoch 5/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8602 - loss: 1.0152 - val_Recall: 0.9144 - val_loss: 0.5584
Epoch 6/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8696 - loss: 0.9570 - val_Recall: 0.9099 - val_loss: 0.5244
Epoch 7/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8407 - loss: 0.9106 - val_Recall: 0.8919 - val_loss: 0.4965
Epoch 8/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8706 - loss: 0.8654 - val_Recall: 0.8919 - val_loss: 0.4663
Epoch 9/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8351 - loss: 0.8537 - val_Recall: 0.8964 - val_loss: 0.4466
Epoch 10/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8774 - loss: 0.8197 - val_Recall: 0.8964 - val_loss: 0.4244
Epoch 11/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8590 - loss: 0.7838 - val_Recall: 0.8874 - val_loss: 0.4010
Epoch 12/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - Recall: 0.8770 - loss: 0.7348 - val_Recall: 0.8874 - val_loss: 0.3782
Epoch 13/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 3s 4ms/step - Recall: 0.8855 - loss: 0.7406 - val_Recall: 0.8829 - val_loss: 0.3612
Epoch 14/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8468 - loss: 0.7132 - val_Recall: 0.8874 - val_loss: 0.3535
Epoch 15/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8621 - loss: 0.6826 - val_Recall: 0.8784 - val_loss: 0.3342
Epoch 16/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8418 - loss: 0.7101 - val_Recall: 0.8829 - val_loss: 0.3276
Epoch 17/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8605 - loss: 0.6643 - val_Recall: 0.8784 - val_loss: 0.3141
Epoch 18/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8624 - loss: 0.6536 - val_Recall: 0.8739 - val_loss: 0.3017
Epoch 19/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8733 - loss: 0.6535 - val_Recall: 0.8649 - val_loss: 0.2904
Epoch 20/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8433 - loss: 0.6350 - val_Recall: 0.8694 - val_loss: 0.2868
Epoch 21/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.8708 - loss: 0.6280 - val_Recall: 0.8694 - val_loss: 0.2815
Epoch 22/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - Recall: 0.8684 - loss: 0.6244 - val_Recall: 0.8694 - val_loss: 0.2785
Epoch 23/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.8767 - loss: 0.6117 - val_Recall: 0.8694 - val_loss: 0.2685
Epoch 24/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8587 - loss: 0.6063 - val_Recall: 0.8694 - val_loss: 0.2672
Epoch 25/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8584 - loss: 0.5946 - val_Recall: 0.8649 - val_loss: 0.2596
Epoch 26/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8691 - loss: 0.5970 - val_Recall: 0.8694 - val_loss: 0.2543
Epoch 27/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8677 - loss: 0.5785 - val_Recall: 0.8694 - val_loss: 0.2475
Epoch 28/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8522 - loss: 0.5776 - val_Recall: 0.8694 - val_loss: 0.2430
Epoch 29/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8716 - loss: 0.5719 - val_Recall: 0.8694 - val_loss: 0.2406
Epoch 30/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8587 - loss: 0.5681 - val_Recall: 0.8694 - val_loss: 0.2385
Epoch 31/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.8580 - loss: 0.5793 - val_Recall: 0.8694 - val_loss: 0.2333
Epoch 32/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - Recall: 0.8756 - loss: 0.5398 - val_Recall: 0.8694 - val_loss: 0.2290
Epoch 33/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - Recall: 0.8799 - loss: 0.5356 - val_Recall: 0.8694 - val_loss: 0.2267
Epoch 34/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8629 - loss: 0.5420 - val_Recall: 0.8694 - val_loss: 0.2242
Epoch 35/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8589 - loss: 0.5693 - val_Recall: 0.8694 - val_loss: 0.2204
Epoch 36/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8659 - loss: 0.5296 - val_Recall: 0.8694 - val_loss: 0.2134
Epoch 37/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8714 - loss: 0.5428 - val_Recall: 0.8694 - val_loss: 0.2141
Epoch 38/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8752 - loss: 0.5285 - val_Recall: 0.8694 - val_loss: 0.2101
Epoch 39/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8858 - loss: 0.5103 - val_Recall: 0.8694 - val_loss: 0.2073
Epoch 40/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8612 - loss: 0.5140 - val_Recall: 0.8694 - val_loss: 0.2029
Epoch 41/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - Recall: 0.8769 - loss: 0.5207 - val_Recall: 0.8694 - val_loss: 0.2046
Epoch 42/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - Recall: 0.8629 - loss: 0.5124 - val_Recall: 0.8694 - val_loss: 0.1975
Epoch 43/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8793 - loss: 0.5029 - val_Recall: 0.8694 - val_loss: 0.1995
Epoch 44/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8605 - loss: 0.5181 - val_Recall: 0.8739 - val_loss: 0.1958
Epoch 45/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8602 - loss: 0.5174 - val_Recall: 0.8739 - val_loss: 0.1922
Epoch 46/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8831 - loss: 0.4971 - val_Recall: 0.8739 - val_loss: 0.1929
Epoch 47/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8786 - loss: 0.5033 - val_Recall: 0.8739 - val_loss: 0.1911
Epoch 48/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8782 - loss: 0.4933 - val_Recall: 0.8739 - val_loss: 0.1884
Epoch 49/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8789 - loss: 0.5071 - val_Recall: 0.8739 - val_loss: 0.1883
Epoch 50/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8671 - loss: 0.4853 - val_Recall: 0.8739 - val_loss: 0.1843
In [ ]:
print("Time taken in seconds ",end - start)
Time taken in seconds  73.65157866477966
In [ ]:
plot(history,'loss', model_13_config)
No description has been provided for this image
In [ ]:
plot(history,'Recall', model_13_config)
No description has been provided for this image
In [ ]:
model_13_train_perf = model_performance_classification(model_11,X_train,y_train)
model_13_train_perf
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step
Out[ ]:
Accuracy Recall Precision F1 Score
0 0.98575 0.955358 0.916728 0.935058
In [ ]:
model_13_val_perf = model_performance_classification(model_13,X_val,y_val)
model_13_val_perf
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
Out[ ]:
Accuracy Recall Precision F1 Score
0 0.962 0.920526 0.801229 0.849072
In [ ]:
# Evaluate model performance on the test set and visualize the confusion matrix.
y_test_pred = model_13.predict(X_test)
for i in range(len(y_test)):
    if y_test_pred[i]>0.5:
        y_test_pred[i]=1
    else:
        y_test_pred[i]=0

cm2=confusion_matrix(y_test, y_test_pred)
make_confusion_matrix(
    cm2,
    cmap='plasma',
    title="Confusion Matrix for Model 13 \n" + model_13_config
    )
157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step
[[4564  154]
 [  42  240]]
No description has been provided for this image

Observation

  • Adjusting the learning rate has helped to reduce oscillations in the loss curves, contributing to smoother convergence.
  • However, the gap between training and validation recall has increased along with increase in number of false negatives on test set.

6.17 - Model 14 - Introduce additional dropout layer¶

  • Model Configuration
    • Add a Dropout layer after the second hidden layer to observe performance
In [ ]:
# Define model configuration string to use in the title of graphs
model_14_config = "(1HL,21N,leaky_relu,he_normal + DR.4 + 2HL,14N,relu,he_normal + DR.3 + 3HL,7N,relu \n adam,,lr.0001 \n bs32, 50E) "
In [ ]:
# Clear the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
In [ ]:
# Initializing the sequential model
model_14 = Sequential()

# Add the first hidden layer with 7 neurons, ReLU activation, and input dimension matching the number of features in X_train
model_14.add(Dense(21, activation="leaky_relu",kernel_initializer='he_normal', input_dim=X_train.shape[1]))

# Dropout 40%
model_14.add(Dropout(0.4))

# Additional hidden layer with 14 neurons, ReLU activation
model_14.add(Dense(14, activation="relu", kernel_initializer='he_normal'))

# Dropout 30%
model_14.add(Dropout(0.3))

# Additional hidden layer with 7 neurons, ReLU activation
model_14.add(Dense(7, activation="relu"))

# Add the output layer with 1 neuron and sigmoid activation for binary classification probability output
model_14.add(Dense(1, activation="sigmoid"))
In [ ]:
model_14.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense (Dense)                   │ (None, 21)             │           861 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout (Dropout)               │ (None, 21)             │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 14)             │           308 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_1 (Dropout)             │ (None, 14)             │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_2 (Dense)                 │ (None, 7)              │           105 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_3 (Dense)                 │ (None, 1)              │             8 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 1,282 (5.01 KB)
 Trainable params: 1,282 (5.01 KB)
 Non-trainable params: 0 (0.00 B)
In [ ]:
# Define Adam as the optimizer to be used
optimizer = tf.keras.optimizers.Adam(learning_rate=0.0001)

# Recall is the chosen metric to measure
model_14.compile(loss='binary_crossentropy', optimizer=optimizer, metrics = ['Recall'])
In [ ]:
start = time.time()
history = model_14.fit(X_train, y_train, validation_data=(X_val,y_val) , batch_size=batch_size, epochs=epochs, class_weight=cw_dict)
end=time.time()
Epoch 1/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 3s 3ms/step - Recall: 0.1600 - loss: 2.3388 - val_Recall: 0.2477 - val_loss: 0.5404
Epoch 2/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.2922 - loss: 1.7231 - val_Recall: 0.6532 - val_loss: 0.5852
Epoch 3/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.4569 - loss: 1.4379 - val_Recall: 0.7703 - val_loss: 0.6022
Epoch 4/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.5784 - loss: 1.3813 - val_Recall: 0.8559 - val_loss: 0.6056
Epoch 5/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.6574 - loss: 1.2792 - val_Recall: 0.9099 - val_loss: 0.6102
Epoch 6/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - Recall: 0.7481 - loss: 1.2078 - val_Recall: 0.9189 - val_loss: 0.6020
Epoch 7/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - Recall: 0.7621 - loss: 1.1672 - val_Recall: 0.9054 - val_loss: 0.5904
Epoch 8/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.7754 - loss: 1.1419 - val_Recall: 0.9144 - val_loss: 0.5800
Epoch 9/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8010 - loss: 1.0704 - val_Recall: 0.9054 - val_loss: 0.5587
Epoch 10/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8173 - loss: 1.0513 - val_Recall: 0.9099 - val_loss: 0.5470
Epoch 11/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8270 - loss: 1.0288 - val_Recall: 0.9099 - val_loss: 0.5262
Epoch 12/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8415 - loss: 1.0016 - val_Recall: 0.9054 - val_loss: 0.5098
Epoch 13/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8630 - loss: 0.9535 - val_Recall: 0.9054 - val_loss: 0.4960
Epoch 14/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8284 - loss: 0.9467 - val_Recall: 0.9009 - val_loss: 0.4888
Epoch 15/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8474 - loss: 0.9169 - val_Recall: 0.9054 - val_loss: 0.4753
Epoch 16/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - Recall: 0.8592 - loss: 0.8830 - val_Recall: 0.8919 - val_loss: 0.4648
Epoch 17/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - Recall: 0.8733 - loss: 0.8636 - val_Recall: 0.8919 - val_loss: 0.4521
Epoch 18/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - Recall: 0.8573 - loss: 0.8503 - val_Recall: 0.8919 - val_loss: 0.4477
Epoch 19/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8635 - loss: 0.8189 - val_Recall: 0.8919 - val_loss: 0.4308
Epoch 20/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8717 - loss: 0.7893 - val_Recall: 0.8874 - val_loss: 0.4183
Epoch 21/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8401 - loss: 0.8306 - val_Recall: 0.8874 - val_loss: 0.4112
Epoch 22/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8545 - loss: 0.8002 - val_Recall: 0.8874 - val_loss: 0.4062
Epoch 23/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8223 - loss: 0.8126 - val_Recall: 0.8874 - val_loss: 0.4004
Epoch 24/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8679 - loss: 0.7714 - val_Recall: 0.8874 - val_loss: 0.3892
Epoch 25/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.8467 - loss: 0.7756 - val_Recall: 0.8874 - val_loss: 0.3848
Epoch 26/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - Recall: 0.8309 - loss: 0.7737 - val_Recall: 0.8874 - val_loss: 0.3791
Epoch 27/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.8460 - loss: 0.7379 - val_Recall: 0.8874 - val_loss: 0.3736
Epoch 28/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - Recall: 0.8613 - loss: 0.7295 - val_Recall: 0.8874 - val_loss: 0.3690
Epoch 29/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8619 - loss: 0.7136 - val_Recall: 0.8694 - val_loss: 0.3574
Epoch 30/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8433 - loss: 0.7132 - val_Recall: 0.8694 - val_loss: 0.3559
Epoch 31/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8637 - loss: 0.7022 - val_Recall: 0.8784 - val_loss: 0.3556
Epoch 32/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8442 - loss: 0.7096 - val_Recall: 0.8784 - val_loss: 0.3495
Epoch 33/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8545 - loss: 0.7072 - val_Recall: 0.8784 - val_loss: 0.3508
Epoch 34/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.8419 - loss: 0.6880 - val_Recall: 0.8694 - val_loss: 0.3426
Epoch 35/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - Recall: 0.8626 - loss: 0.6664 - val_Recall: 0.8694 - val_loss: 0.3374
Epoch 36/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.8626 - loss: 0.6610 - val_Recall: 0.8694 - val_loss: 0.3325
Epoch 37/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8454 - loss: 0.6705 - val_Recall: 0.8694 - val_loss: 0.3313
Epoch 38/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8682 - loss: 0.6962 - val_Recall: 0.8694 - val_loss: 0.3273
Epoch 39/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8642 - loss: 0.6613 - val_Recall: 0.8649 - val_loss: 0.3226
Epoch 40/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8671 - loss: 0.6485 - val_Recall: 0.8649 - val_loss: 0.3187
Epoch 41/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8532 - loss: 0.6530 - val_Recall: 0.8604 - val_loss: 0.3162
Epoch 42/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8534 - loss: 0.6271 - val_Recall: 0.8649 - val_loss: 0.3127
Epoch 43/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.8456 - loss: 0.6386 - val_Recall: 0.8649 - val_loss: 0.3118
Epoch 44/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 3s 4ms/step - Recall: 0.8497 - loss: 0.6385 - val_Recall: 0.8649 - val_loss: 0.3077
Epoch 45/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8480 - loss: 0.6159 - val_Recall: 0.8649 - val_loss: 0.3009
Epoch 46/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8404 - loss: 0.6364 - val_Recall: 0.8649 - val_loss: 0.3032
Epoch 47/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8491 - loss: 0.6341 - val_Recall: 0.8604 - val_loss: 0.3000
Epoch 48/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8633 - loss: 0.6077 - val_Recall: 0.8604 - val_loss: 0.2994
Epoch 49/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8458 - loss: 0.6051 - val_Recall: 0.8559 - val_loss: 0.2907
Epoch 50/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8600 - loss: 0.5851 - val_Recall: 0.8559 - val_loss: 0.2858
In [ ]:
print("Time taken in seconds ",end - start)
Time taken in seconds  75.48225545883179
In [ ]:
plot(history,'loss', model_14_config)
No description has been provided for this image
In [ ]:
plot(history,'Recall', model_14_config)
No description has been provided for this image
In [ ]:
model_14_train_perf = model_performance_classification(model_14,X_train,y_train)
model_14_train_perf
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step
Out[ ]:
Accuracy Recall Precision F1 Score
0 0.93425 0.912196 0.72293 0.78194
In [ ]:
model_14_val_perf = model_performance_classification(model_14,X_val,y_val)
model_14_val_perf
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
Out[ ]:
Accuracy Recall Precision F1 Score
0 0.931 0.895636 0.714407 0.770843
In [ ]:
# Evaluate model performance on the test set and visualize the confusion matrix.
y_test_pred = model_14.predict(X_test)
for i in range(len(y_test)):
    if y_test_pred[i]>0.5:
        y_test_pred[i]=1
    else:
        y_test_pred[i]=0

cm2=confusion_matrix(y_test, y_test_pred)
make_confusion_matrix(
    cm2,
    cmap='plasma',
    title="Confusion Matrix for Model 14 \n" + model_14_config
    )
157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step
[[4447  271]
 [  48  234]]
No description has been provided for this image

Observation

  • Droput layer has helped to reduce the overfitting
  • The Recall metric on both training and validation sets started to dip
  • False Negatives observed on the test set increased.

6.18 - Model 15 - Add fourth hidden layer¶

  • Model Configuration
    • Add a fourth hidden layer with 5 neurons and sigmoid activation to evaluate the impact on model performance.
In [ ]:
# Define model configuration string to use in the title of graphs
model_15_config = "(1HL,21N,leaky_relu,he_normal + DR.4 + 2HL,14N,relu,he_normal + 3HL,7N,relu + 4HL,5N,sigmoid \n adam \n bs32,50E) "
In [ ]:
# Clear the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
In [ ]:
# Initializing the sequential model
model_15 = Sequential()

# Add the first hidden layer with 7 neurons, ReLU activation, and input dimension matching the number of features in X_train
model_15.add(Dense(21, activation="leaky_relu",kernel_initializer='he_normal', input_dim=X_train.shape[1]))

# Dropout 40%
model_15.add(Dropout(0.4))

# Additional hidden layer with 14 neurons, ReLU activation
model_15.add(Dense(14, activation="relu", kernel_initializer='he_normal'))

# Additional hidden layer with 7 neurons, ReLU activation
model_15.add(Dense(7, activation="relu"))

# Additional hidden layer with 5 neurons, sigmoid activation
model_15.add(Dense(5, activation="sigmoid"))

# Add the output layer with 1 neuron and sigmoid activation for binary classification probability output
model_15.add(Dense(1, activation="sigmoid"))
In [ ]:
model_15.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense (Dense)                   │ (None, 21)             │           861 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout (Dropout)               │ (None, 21)             │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 14)             │           308 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_2 (Dense)                 │ (None, 7)              │           105 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_3 (Dense)                 │ (None, 5)              │            40 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_4 (Dense)                 │ (None, 1)              │             6 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 1,320 (5.16 KB)
 Trainable params: 1,320 (5.16 KB)
 Non-trainable params: 0 (0.00 B)
In [ ]:
# Define Adam as the optimizer to be used
optimizer = tf.keras.optimizers.Adam()

# Recall is the chosen metric to measure
model_15.compile(loss='binary_crossentropy', optimizer=optimizer, metrics = ['Recall'])
In [ ]:
start = time.time()
history = model_15.fit(X_train, y_train, validation_data=(X_val,y_val) , batch_size=batch_size, epochs=epochs, class_weight=cw_dict)
end=time.time()
Epoch 1/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 3s 3ms/step - Recall: 0.9039 - loss: 1.2544 - val_Recall: 0.8784 - val_loss: 0.4286
Epoch 2/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - Recall: 0.8634 - loss: 0.8213 - val_Recall: 0.8874 - val_loss: 0.3077
Epoch 3/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 5ms/step - Recall: 0.8648 - loss: 0.6981 - val_Recall: 0.8784 - val_loss: 0.2740
Epoch 4/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8697 - loss: 0.6328 - val_Recall: 0.8694 - val_loss: 0.2398
Epoch 5/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8816 - loss: 0.5779 - val_Recall: 0.8739 - val_loss: 0.2437
Epoch 6/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8764 - loss: 0.5764 - val_Recall: 0.8694 - val_loss: 0.2136
Epoch 7/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8694 - loss: 0.5440 - val_Recall: 0.8739 - val_loss: 0.2093
Epoch 8/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8687 - loss: 0.5173 - val_Recall: 0.8829 - val_loss: 0.1987
Epoch 9/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8869 - loss: 0.4794 - val_Recall: 0.8739 - val_loss: 0.1707
Epoch 10/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8853 - loss: 0.4659 - val_Recall: 0.8739 - val_loss: 0.1640
Epoch 11/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.8710 - loss: 0.4813 - val_Recall: 0.8829 - val_loss: 0.1721
Epoch 12/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - Recall: 0.8886 - loss: 0.4590 - val_Recall: 0.8874 - val_loss: 0.1614
Epoch 13/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.8760 - loss: 0.4545 - val_Recall: 0.8874 - val_loss: 0.1606
Epoch 14/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8772 - loss: 0.4469 - val_Recall: 0.8874 - val_loss: 0.1748
Epoch 15/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8909 - loss: 0.4262 - val_Recall: 0.8874 - val_loss: 0.1629
Epoch 16/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8916 - loss: 0.4273 - val_Recall: 0.8829 - val_loss: 0.1562
Epoch 17/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8957 - loss: 0.4207 - val_Recall: 0.8874 - val_loss: 0.1567
Epoch 18/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8790 - loss: 0.4365 - val_Recall: 0.8784 - val_loss: 0.1515
Epoch 19/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8749 - loss: 0.4266 - val_Recall: 0.8874 - val_loss: 0.1477
Epoch 20/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8788 - loss: 0.4264 - val_Recall: 0.8829 - val_loss: 0.1631
Epoch 21/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 3s 4ms/step - Recall: 0.8998 - loss: 0.4088 - val_Recall: 0.8829 - val_loss: 0.1486
Epoch 22/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - Recall: 0.8945 - loss: 0.4028 - val_Recall: 0.8784 - val_loss: 0.1388
Epoch 23/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 3s 2ms/step - Recall: 0.8971 - loss: 0.3841 - val_Recall: 0.8874 - val_loss: 0.1338
Epoch 24/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8899 - loss: 0.3969 - val_Recall: 0.8874 - val_loss: 0.1429
Epoch 25/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8860 - loss: 0.4126 - val_Recall: 0.8829 - val_loss: 0.1353
Epoch 26/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8921 - loss: 0.3847 - val_Recall: 0.8874 - val_loss: 0.1495
Epoch 27/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.9091 - loss: 0.3786 - val_Recall: 0.8829 - val_loss: 0.1288
Epoch 28/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - Recall: 0.8854 - loss: 0.3970 - val_Recall: 0.8784 - val_loss: 0.1350
Epoch 29/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 3s 5ms/step - Recall: 0.9007 - loss: 0.3877 - val_Recall: 0.8829 - val_loss: 0.1315
Epoch 30/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8962 - loss: 0.3848 - val_Recall: 0.8784 - val_loss: 0.1243
Epoch 31/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8893 - loss: 0.3771 - val_Recall: 0.8784 - val_loss: 0.1212
Epoch 32/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8955 - loss: 0.3737 - val_Recall: 0.8784 - val_loss: 0.1293
Epoch 33/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8877 - loss: 0.3812 - val_Recall: 0.8829 - val_loss: 0.1294
Epoch 34/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8966 - loss: 0.3808 - val_Recall: 0.8874 - val_loss: 0.1414
Epoch 35/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8930 - loss: 0.3795 - val_Recall: 0.8829 - val_loss: 0.1295
Epoch 36/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.9047 - loss: 0.3506 - val_Recall: 0.8829 - val_loss: 0.1292
Epoch 37/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.9036 - loss: 0.3616 - val_Recall: 0.8784 - val_loss: 0.1221
Epoch 38/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 5ms/step - Recall: 0.8997 - loss: 0.3771 - val_Recall: 0.8784 - val_loss: 0.1257
Epoch 39/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8863 - loss: 0.3844 - val_Recall: 0.8784 - val_loss: 0.1259
Epoch 40/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9008 - loss: 0.3609 - val_Recall: 0.8784 - val_loss: 0.1257
Epoch 41/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8891 - loss: 0.3715 - val_Recall: 0.8784 - val_loss: 0.1289
Epoch 42/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9024 - loss: 0.3670 - val_Recall: 0.8739 - val_loss: 0.1260
Epoch 43/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8945 - loss: 0.3655 - val_Recall: 0.8694 - val_loss: 0.1234
Epoch 44/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8938 - loss: 0.3619 - val_Recall: 0.8784 - val_loss: 0.1238
Epoch 45/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.9039 - loss: 0.3562 - val_Recall: 0.8739 - val_loss: 0.1303
Epoch 46/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.9061 - loss: 0.3636 - val_Recall: 0.8649 - val_loss: 0.1180
Epoch 47/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - Recall: 0.9008 - loss: 0.3524 - val_Recall: 0.8694 - val_loss: 0.1189
Epoch 48/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - Recall: 0.9038 - loss: 0.3488 - val_Recall: 0.8739 - val_loss: 0.1306
Epoch 49/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8970 - loss: 0.3474 - val_Recall: 0.8649 - val_loss: 0.1152
Epoch 50/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8979 - loss: 0.3558 - val_Recall: 0.8739 - val_loss: 0.1185
In [ ]:
print("Time taken in seconds ",end - start)
Time taken in seconds  78.96769118309021
In [ ]:
plot(history,'loss', model_15_config)
No description has been provided for this image
In [ ]:
plot(history,'Recall', model_15_config)
No description has been provided for this image
In [ ]:
model_15_train_perf = model_performance_classification(model_15,X_train,y_train)
model_15_train_perf
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step  
Out[ ]:
Accuracy Recall Precision F1 Score
0 0.988688 0.946313 0.945841 0.946077
In [ ]:
model_15_val_perf = model_performance_classification(model_15,X_val,y_val)
model_15_val_perf
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step  
Out[ ]:
Accuracy Recall Precision F1 Score
0 0.98575 0.933099 0.931271 0.932183
In [ ]:
# Evaluate model performance on the test set and visualize the confusion matrix.
y_test_pred = model_15.predict(X_test)
for i in range(len(y_test)):
    if y_test_pred[i]>0.5:
        y_test_pred[i]=1
    else:
        y_test_pred[i]=0

cm2=confusion_matrix(y_test, y_test_pred)
make_confusion_matrix(
    cm2,
    cmap='plasma',
    title="Confusion Matrix for Model 15 \n" + model_15_config
    )
157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step
[[4690   28]
 [  38  244]]
No description has been provided for this image

Observation

  • Adding a fourth hidden layer has helped redcue false negatives

6.19 - Model 16 - Change activation function of the first hidden layer¶

  • Model Configuration
    • Change the activation function of first hidden layer to sigmoid and observe the performance
In [ ]:
# Define model configuration string to use in the title of graphs
model_16_config = "(1HL,21N,sigmoid + DR.4 + 2HL,14N,relu,he_normal + 3HL,7N,relu + 4HL,5N,sigmoid \n adam \n bs32,50E) "
In [ ]:
# Clear the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
In [ ]:
# Initializing the sequential model
model_16 = Sequential()

# Add the first hidden layer with 7 neurons, ReLU activation, and input dimension matching the number of features in X_train
model_16.add(Dense(21, activation="sigmoid", input_dim=X_train.shape[1]))

# Dropout 40%
model_16.add(Dropout(0.4))

# Additional hidden layer with 14 neurons, ReLU activation
model_16.add(Dense(14, activation="relu", kernel_initializer='he_normal'))

# Additional hidden layer with 7 neurons, ReLU activation
model_16.add(Dense(7, activation="relu"))

# Additional hidden layer with 5 neurons, sigmoid activation
model_16.add(Dense(5, activation="sigmoid"))

# Add the output layer with 1 neuron and sigmoid activation for binary classification probability output
model_16.add(Dense(1, activation="sigmoid"))
In [ ]:
model_16.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense (Dense)                   │ (None, 21)             │           861 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout (Dropout)               │ (None, 21)             │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 14)             │           308 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_2 (Dense)                 │ (None, 7)              │           105 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_3 (Dense)                 │ (None, 5)              │            40 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_4 (Dense)                 │ (None, 1)              │             6 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 1,320 (5.16 KB)
 Trainable params: 1,320 (5.16 KB)
 Non-trainable params: 0 (0.00 B)
In [ ]:
# Define Adam as the optimizer to be used
optimizer = tf.keras.optimizers.Adam()

# Recall is the chosen metric to measure
model_16.compile(loss='binary_crossentropy', optimizer=optimizer, metrics = ['Recall'])
In [ ]:
start = time.time()
history = model_16.fit(X_train, y_train, validation_data=(X_val,y_val) , batch_size=batch_size, epochs=epochs, class_weight=cw_dict)
end=time.time()
Epoch 1/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 3s 3ms/step - Recall: 0.1892 - loss: 1.4087 - val_Recall: 0.8964 - val_loss: 0.4305
Epoch 2/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8490 - loss: 0.8693 - val_Recall: 0.8694 - val_loss: 0.3544
Epoch 3/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - Recall: 0.8622 - loss: 0.7732 - val_Recall: 0.8514 - val_loss: 0.3098
Epoch 4/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.8568 - loss: 0.7250 - val_Recall: 0.8514 - val_loss: 0.3278
Epoch 5/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8652 - loss: 0.6840 - val_Recall: 0.8468 - val_loss: 0.3062
Epoch 6/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8576 - loss: 0.6778 - val_Recall: 0.8423 - val_loss: 0.2819
Epoch 7/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8527 - loss: 0.6514 - val_Recall: 0.8514 - val_loss: 0.2928
Epoch 8/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8617 - loss: 0.6390 - val_Recall: 0.8423 - val_loss: 0.2790
Epoch 9/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8619 - loss: 0.6122 - val_Recall: 0.8423 - val_loss: 0.2767
Epoch 10/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8470 - loss: 0.6315 - val_Recall: 0.8423 - val_loss: 0.2746
Epoch 11/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8485 - loss: 0.6095 - val_Recall: 0.8468 - val_loss: 0.2622
Epoch 12/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.8542 - loss: 0.5878 - val_Recall: 0.8514 - val_loss: 0.2585
Epoch 13/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - Recall: 0.8592 - loss: 0.5906 - val_Recall: 0.8514 - val_loss: 0.2401
Epoch 14/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8602 - loss: 0.5768 - val_Recall: 0.8559 - val_loss: 0.2444
Epoch 15/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8547 - loss: 0.5609 - val_Recall: 0.8559 - val_loss: 0.2153
Epoch 16/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8563 - loss: 0.5555 - val_Recall: 0.8604 - val_loss: 0.2095
Epoch 17/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8640 - loss: 0.5329 - val_Recall: 0.8604 - val_loss: 0.1946
Epoch 18/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8727 - loss: 0.5150 - val_Recall: 0.8694 - val_loss: 0.1865
Epoch 19/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8635 - loss: 0.5241 - val_Recall: 0.8694 - val_loss: 0.1857
Epoch 20/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8576 - loss: 0.5117 - val_Recall: 0.8649 - val_loss: 0.1703
Epoch 21/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8642 - loss: 0.4899 - val_Recall: 0.8694 - val_loss: 0.1707
Epoch 22/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - Recall: 0.8695 - loss: 0.4858 - val_Recall: 0.8694 - val_loss: 0.1643
Epoch 23/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - Recall: 0.8747 - loss: 0.4798 - val_Recall: 0.8694 - val_loss: 0.1547
Epoch 24/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - Recall: 0.8576 - loss: 0.4822 - val_Recall: 0.8739 - val_loss: 0.1596
Epoch 25/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8670 - loss: 0.4744 - val_Recall: 0.8739 - val_loss: 0.1424
Epoch 26/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8671 - loss: 0.4686 - val_Recall: 0.8739 - val_loss: 0.1677
Epoch 27/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8656 - loss: 0.4680 - val_Recall: 0.8739 - val_loss: 0.1533
Epoch 28/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8537 - loss: 0.4742 - val_Recall: 0.8829 - val_loss: 0.1488
Epoch 29/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8764 - loss: 0.4466 - val_Recall: 0.8739 - val_loss: 0.1475
Epoch 30/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8772 - loss: 0.4351 - val_Recall: 0.8739 - val_loss: 0.1394
Epoch 31/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 3s 4ms/step - Recall: 0.8636 - loss: 0.4494 - val_Recall: 0.8784 - val_loss: 0.1423
Epoch 32/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - Recall: 0.8561 - loss: 0.4478 - val_Recall: 0.8784 - val_loss: 0.1433
Epoch 33/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8706 - loss: 0.4339 - val_Recall: 0.8784 - val_loss: 0.1458
Epoch 34/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8625 - loss: 0.4369 - val_Recall: 0.8784 - val_loss: 0.1454
Epoch 35/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8769 - loss: 0.4319 - val_Recall: 0.8739 - val_loss: 0.1296
Epoch 36/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8681 - loss: 0.4411 - val_Recall: 0.8829 - val_loss: 0.1584
Epoch 37/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8621 - loss: 0.4462 - val_Recall: 0.8739 - val_loss: 0.1466
Epoch 38/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8791 - loss: 0.4049 - val_Recall: 0.8739 - val_loss: 0.1309
Epoch 39/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8691 - loss: 0.4310 - val_Recall: 0.8829 - val_loss: 0.1564
Epoch 40/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - Recall: 0.8904 - loss: 0.4056 - val_Recall: 0.8784 - val_loss: 0.1365
Epoch 41/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - Recall: 0.8783 - loss: 0.4142 - val_Recall: 0.8739 - val_loss: 0.1319
Epoch 42/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - Recall: 0.8731 - loss: 0.4206 - val_Recall: 0.8739 - val_loss: 0.1291
Epoch 43/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - Recall: 0.8765 - loss: 0.4133 - val_Recall: 0.8784 - val_loss: 0.1343
Epoch 44/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8763 - loss: 0.4249 - val_Recall: 0.8739 - val_loss: 0.1282
Epoch 45/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8793 - loss: 0.4039 - val_Recall: 0.8694 - val_loss: 0.1190
Epoch 46/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8858 - loss: 0.3983 - val_Recall: 0.8739 - val_loss: 0.1206
Epoch 47/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8970 - loss: 0.3760 - val_Recall: 0.8739 - val_loss: 0.1191
Epoch 48/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8826 - loss: 0.4015 - val_Recall: 0.8739 - val_loss: 0.1285
Epoch 49/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - Recall: 0.8737 - loss: 0.4131 - val_Recall: 0.8784 - val_loss: 0.1326
Epoch 50/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - Recall: 0.8802 - loss: 0.4217 - val_Recall: 0.8784 - val_loss: 0.1222
In [ ]:
print("Time taken in seconds ",end - start)
Time taken in seconds  74.32506060600281
In [ ]:
plot(history,'loss', model_16_config)
No description has been provided for this image
In [ ]:
plot(history,'Recall', model_16_config)
No description has been provided for this image
In [ ]:
model_16_train_perf = model_performance_classification(model_16,X_train,y_train)
model_16_train_perf
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step
Out[ ]:
Accuracy Recall Precision F1 Score
0 0.989812 0.945849 0.956014 0.950867
In [ ]:
model_16_val_perf = model_performance_classification(model_16,X_val,y_val)
model_16_val_perf
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
Out[ ]:
Accuracy Recall Precision F1 Score
0 0.988 0.93641 0.947821 0.942032
In [ ]:
# Evaluate model performance on the test set and visualize the confusion matrix.
y_test_pred = model_16.predict(X_test)
for i in range(len(y_test)):
    if y_test_pred[i]>0.5:
        y_test_pred[i]=1
    else:
        y_test_pred[i]=0

cm2=confusion_matrix(y_test, y_test_pred)
make_confusion_matrix(
    cm2,
    cmap='plasma',
    title="Confusion Matrix for Model 16 \n" + model_16_config
    )
157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step
[[4701   17]
 [  44  238]]
No description has been provided for this image

Observation

  • The recall metric on train and val is very close suggesting good generalization.
  • However, overall recall is more or less in the same range as observed in few models earlier
  • False negatives has increased on test data set

7 - Model Performance Comparison and Final Model Selection¶

  • Compare the performances of all the models for the training and validation sets.
  • Choose the best model.
In [ ]:
# Create a dataframe of train set performaces

models_train_comp_df = pd.concat(
    [
        model_0_train_perf.T,
        model_1_train_perf.T,
        model_2_train_perf.T,
        model_3_train_perf.T,
        model_4_train_perf.T,
        model_5_train_perf.T,
        model_6_train_perf.T,
        model_7_train_perf.T,
        model_8_train_perf.T,
        model_9_train_perf.T,
        model_10_train_perf.T,
        model_11_train_perf.T,
        model_12_train_perf.T,
        model_13_train_perf.T,
        model_14_train_perf.T,
        model_15_train_perf.T,
        model_16_train_perf.T
    ],
    axis=1,
)
models_train_comp_df.columns = [
    "Model 0",
    "Model 1",
    "Model 2",
    "Model 3",
    "Model 4",
    "Model 5",
    "Model 6",
    "Model 7",
    "Model 8",
    "Model 9",
    "Model 10",
    "Model 11",
    "Model 12",
    "Model 13",
    "Model 14",
    "Model 15",
    "Model 16"
]

Validation Performance Comparison

In [ ]:
# Create a dataframe of val set performaces
models_val_comp_df = pd.concat(
    [
        model_0_val_perf.T,
        model_1_val_perf.T,
        model_2_val_perf.T,
        model_3_val_perf.T,
        model_4_val_perf.T,
        model_5_val_perf.T,
        model_6_val_perf.T,
        model_7_val_perf.T,
        model_8_val_perf.T,
        model_9_val_perf.T,
        model_10_val_perf.T,
        model_11_val_perf.T,
        model_12_val_perf.T,
        model_13_val_perf.T,
        model_14_val_perf.T,
        model_15_val_perf.T,
        model_16_val_perf.T
    ],
    axis=1,
)
models_val_comp_df.columns = [
    "Model 0",
    "Model 1",
    "Model 2",
    "Model 3",
    "Model 4",
    "Model 5",
    "Model 6",
    "Model 7",
    "Model 8",
    "Model 9",
    "Model 10",
    "Model 11",
    "Model 12",
    "Model 13",
    "Model 14",
    "Model 15",
    "Model 16"
]
In [ ]:
print("Training set performance comparison:")
models_train_comp_df
Training set performance comparison:
Out[ ]:
Model 0 Model 1 Model 2 Model 3 Model 4 Model 5 Model 6 Model 7 Model 8 Model 9 Model 10 Model 11 Model 12 Model 13 Model 14 Model 15 Model 16
Accuracy 0.945438 0.955562 0.935187 0.961125 0.947500 0.970875 0.972063 0.976437 0.970375 0.982187 0.989938 0.985750 0.989500 0.985750 0.934250 0.988688 0.989812
Recall 0.925008 0.933018 0.919052 0.941792 0.933519 0.949073 0.955002 0.953608 0.953048 0.952412 0.950685 0.955358 0.940914 0.955358 0.912196 0.946313 0.945849
Precision 0.749345 0.778021 0.725615 0.796185 0.755054 0.834209 0.838366 0.860204 0.831200 0.894097 0.953087 0.916728 0.957439 0.916728 0.722930 0.945841 0.956014
F1 Score 0.808852 0.834986 0.785717 0.851689 0.815737 0.881674 0.886548 0.900625 0.880902 0.920928 0.951882 0.935058 0.949003 0.935058 0.781940 0.946077 0.950867
In [ ]:
print("Validation set performance comparison:")
models_val_comp_df
Validation set performance comparison:
Out[ ]:
Model 0 Model 1 Model 2 Model 3 Model 4 Model 5 Model 6 Model 7 Model 8 Model 9 Model 10 Model 11 Model 12 Model 13 Model 14 Model 15 Model 16
Accuracy 0.950500 0.958250 0.937500 0.960250 0.950000 0.970500 0.967500 0.974750 0.963000 0.979000 0.990250 0.982250 0.987750 0.962000 0.931000 0.985750 0.988000
Recall 0.918678 0.924901 0.907556 0.928079 0.926893 0.937745 0.931917 0.946355 0.933775 0.933765 0.941841 0.935486 0.929918 0.920526 0.895636 0.933099 0.936410
Precision 0.762725 0.787146 0.729267 0.794025 0.761513 0.834868 0.822411 0.853583 0.803754 0.882356 0.963526 0.903181 0.951017 0.801229 0.714407 0.931271 0.947821
F1 Score 0.818843 0.839934 0.787096 0.846078 0.819820 0.878215 0.867729 0.893655 0.855033 0.906183 0.952388 0.918616 0.940180 0.849072 0.770843 0.932183 0.942032

Choose best model

  • Model 7, 10 and Model 11 shows comparable performance metrics
  • Model 11 exhibits overfitting. Training recall is 96.5% while validation recall is noticeably lower (93.3%), suggesting not good generalising ability.
  • Model 7 and 10 shows robust generalization. Training recall of Model 7 is 95.4% and validation recall is 94.6%. The training recall of Model 10 is 95.1% and validation recall is 94.2%. The minimal generalization gap between the two sets of metrics indicates a well-regularized and stable performance for recall.
  • Amongst, Model 7 and Model 11, Model 7 shows marginally better generalization and also less false negatives. Hence, Model 7 is chosen as best model in this exercise.
In [ ]:
# Model_7 shows the highest validation recall (94.63%) amongst all models.
# The corresponding training recall is 95.36%, indicating a minimal recall gap (0.73%).
# This suggests good generalization and a low risk of overfitting.
# Selecting Model_7 as the final best model.
best_model = model_7
In [ ]:
# Test set performance for the best model
best_model_test_perf = model_performance_classification(best_model, X_test, y_test)
best_model_test_perf
157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step
Out[ ]:
Accuracy Recall Precision F1 Score
0 0.9692 0.921998 0.834409 0.872205
In [ ]:
# Derive classification_report
y_test_pred_best = best_model.predict(X_test)

# Check the classification report of best model on test data.
cr_test_best_model = classification_report(y_test, y_test_pred_best>0.5)
print(cr_test_best_model)
157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
              precision    recall  f1-score   support

         0.0       0.99      0.98      0.98      4718
         1.0       0.68      0.87      0.76       282

    accuracy                           0.97      5000
   macro avg       0.83      0.92      0.87      5000
weighted avg       0.97      0.97      0.97      5000

In [ ]:
# Evaluate best model (Model_7) performance on the test set and visualize the confusion matrix.
y_test_pred = best_model.predict(X_test)
for i in range(len(y_test)):
    if y_test_pred[i]>0.5:
        y_test_pred[i]=1
    else:
        y_test_pred[i]=0

cm2=confusion_matrix(y_test, y_test_pred)
make_confusion_matrix(
    cm2,
    cmap='plasma',
    title="Confusion Matrix for best model (Model 7) \n" + model_7_config
    )
157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
[[4601  117]
 [  37  245]]
No description has been provided for this image

8 - Actionable Insights and Recommendations¶

Actionable insighs

  1. Model 7 demonstrates good predictive capability with 94.6% recall on validation data, 96.9% accuracy on test data, successfully identifying 92% of actual generator failures. This model has strong generalization ability (compared to other models verified in this particular exercise) confirmed with minimal performance gap between training recall and validation recall.

  2. Model optimization opportunities exist to potentially improve recall beyond the current 92%. ReneWind could invest computation and effort to further tune hyperparameters to see if recall improves.

  3. Shifting the focus from reactive to predictive maintenance approach, ReneWind should proceed to deploy Model 7 in production and start optimizing cost through better repair vs replacement decisions.

  4. ReneWind can establish real-time monitoring dashboard to track model predictions and trigger maintenance workflows with alert system for high-risk generators flagged by the model to ensure rapid response.

Business Recommendations

  1. Keeping the cost hierarchy in context (Replacement > Repair > Inspection), the business can significantly reduce replacement costs by catching failures early through predictive maintenance, thereby saving substantial amounts for the company. With the model identifying 86.9% of actual failures before they occur, ReneWind can shift from expensive emergency replacements to planned, cost-effective repairs.

  2. This predictive approach will also assist in reducing downtime costs and revenue loss by enabling proactive maintenance scheduling during planned windows rather than unexpected equipment failures that cause unplanned outages and production losses

In [ ]:
!jupyter nbconvert --to html "/content/drive/MyDrive/Colab Notebooks/Project-4/renewind_nn.ipynb"